The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

One or multiple databases?

Hi all,

I have been reading these forums for quite a while now but this is my first post.

I am about to start designing a new web application.
The application should be shared by several customers and the question is: Should each of the customers have their own database or should they share one?
I have found some pros and cons for both alternatives.
1. Own database:
Pros: Performance is expected to be better with own databases. And it would be easier to move a customer that use the system a lot to a separate set of servers. It is also fault tolerant, since only their own data exist in the database. This alternative is also a lot easier to develop.
Cons: Maintenance will be harder. I expect the number of customers to be a couple of hundred within a couple of years and maintaining 100 databases instead of just one is  a nightmare.

2. Shared database.
Pros: Maintenance. It is easier to maintain only one database instead of several. It is also a bit easier to collect statistics about the usage of the application if the database is shared. Administrator application is easier to develop as well.
Cons: Harder to develop. Need to keep track of which customer is currently running. Less fault tolerant. If I introduce a bug here, customers might see other customers data. Perhaps less efficient due to database locking. Harder to move a customer to a separate set of servers if one customer is using much of the server resources.

I am leaning towards shared database due to maintenance despite the fact that it is probably harder to develop. I will do the maintenance myself to begin with.
What do you say?
Roger Send private email
Monday, December 11, 2006
 
 
*myName
Monday, December 11, 2006
 
 
Single database, single set of tables with data differentiated by customer_id. For an enterprise-strength system you'd consider partitioning by customer id also.

This will give you the best performance/price ratio, and the security ought to be manageable either through views on top of the data tables or through a virtual private database solution (like Oracle provides).
David Aldridge Send private email
Monday, December 11, 2006
 
 
Design for a single database if possible. You can always deploy multiple databases with such a design if you end up wanting to. Conversely, retrofitting support for multiple customers in a single DB later on will be hard.
clcr
Monday, December 11, 2006
 
 
Build it multi-tenant, but allow tenants to be put in a database/server of your choice.  Then you can move a big tenant to their own database if you want or forklift data around if you so desire.  The configuration of server/database for a particular tenant can come from a master database server, Active Directory, and XML file or whatever.

If you use this technique, you will not be able to use unique autonumber columns and expect to merge databases.
Cade Roux Send private email
Monday, December 11, 2006
 
 
And not to mention with a shared plan, you can still pull out a single customer's information if they wish to leave.  I don't see that as much of a con, considering the enormous hassle of maintaining a database for each customer.
Ejustice Send private email
Monday, December 11, 2006
 
 
Phil C Send private email
Monday, December 11, 2006
 
 
Shared database is great until one of your customers wants customization. Then you are stuck trying to figure out how to give them some special column they want without impacting everyone else.

But that is just one con.
anon
Monday, December 11, 2006
 
 
If one customer needs a schema change (and you have to/want to accommodate it without migrating all customers to the new version), then forklift the data out into a separate database, upgrade them first.

Migrate the rest later.

Building multi-tenant now will give you flexibility in your code from the get go.

Normally, I do not recommend over-designing a system.  But you have to look at likelihoods and costs.  Building a system which is not multi-tenant and then having to change (and test) the system later is very expensive.  Doing it at the beginning and testing that way is not as expensive in the long run and the likelihood of such a change being determined to be necessary for maintenance may be determined to be quite likely.
Cade Roux Send private email
Monday, December 11, 2006
 
 
Thank you all for your valuable comments and pointers.
I was not aware of the term "multi tenant" before and now I am able to search for more information about that.
Roger Send private email
Tuesday, December 12, 2006
 
 
Ask your customer how they feel about having their data commingled? Most won't like it.
son of parnas
Tuesday, December 12, 2006
 
 
Good point son of parnas.

You also can't set fine grained security on the database with comingled data. For example, you can't give some users access to a table but not others. So every customer has access to each other's data unless you SUCCESSFULLY code to keep them out of it. It only takes one simple SQL injection flaw to allow customers to see each others data.

So if you intend to use one database, what is your plan for ensuring that it is IMPOSSIBLE for one customer to view another customer's data? If you think about how hard this actually is you will end up opting for more than one database.
dood mcdoogle
Tuesday, December 12, 2006
 
 
Of course the customer would not like it. That's why you DONT ask. You don't let your customer make these decisions for you, especially since customers are usually clueless in technical matters, or at least lack the detailed knowledge of your business.
Matthias W. Send private email
Tuesday, December 12, 2006
 
 
Well Mathias, the "customers" involved in the latest case of identity theft at a university may not be technical, but they have a vested interest in knowing that their data is secure.

http://www.msnbc.msn.com/id/16169453/

They don't come right out and say it but "a sophisticated hacker found and exploited a subtle vulnerability in one of hundreds of applications" usually means a SQL injection vulnerability. So if UCLA can't get it right, what makes the OP think he can do any better?

Even if you don't tell your customers up front, they are going to find out anyway when the sh$% hits the fan. And when they do they are going to sue your butt. I hope you have errors and omissions insurance if you are going to take that attitude about security!
dood mcdoogle
Tuesday, December 12, 2006
 
 
"Well Mathias, the "customers" involved in the latest case of identity theft at a university may not be technical, but they have a vested interest in knowing that their data is secure."

What's your point? 800,000 databases?

Invulnerability to SQL injection is not rocket science, and nor is architecting a robust VPD.
David Aldridge Send private email
Wednesday, December 13, 2006
 
 
"What's your point? 800,000 databases?"

No, clearly it is not. My point is that hiding security decisions from your customers is not a good idea. They have a vested interest in security as well. In fact, they will end up paying the price when your security fails. Now I don't expect you to go ask 800,000 people about security policies. But making decisions that you KNOW will put them at risk simply because you don't think that they will either care or want to know about it is not doing anybody any favors.

The original OP likely has a few small customers. I don't think that it is unreasonable to ask their opinion about how their data storage ends up being implemented. After all, when things fail they will find out anyway. And you just might learn a little something along the way.
dood mcdoogle
Wednesday, December 13, 2006
 
 
"I don't think that it is unreasonable to ask their opinion about how their data storage ends up being implemented."

When you approach them with the problem that you don't feel competent to ensure that their data is secure from other customers then I think the multiple-customer question will be rendered moot pretty quckly.
David Aldridge Send private email
Wednesday, December 13, 2006
 
 
We have a web based service where we have hundreds of customers with the data in the same database.  Most tables have a clustered index on customer_id which greatly helps performance because all of your queries should have customer_id in the joins/where clause.
An Non E Mos Send private email
Thursday, December 21, 2006
 
 

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics
 
Powered by FogBugz