The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

Need advice on database design issue


My R&D team is in the midst of reengineering a desktop database app to run on the web.  Currently, each of our clients have their own copy of the app and database at their location.  We're going to be using MySQL/RoR for the web-based app, and I have a "What would you do?" type question regarding the DB design.

I'm of the opinion that we should have one MySQL database that all our clients will be using.  Each of the tables will of course have a company_id column that will be used to identify a client's specific data, and all queries will specify the company_id column so that there's no possibility of one client seeing another client's data.  Pretty basic stuff.

My colleague, on the other hand, thinks we should setup a separate MySQL db for every client ie. 1000 clients = 1000 db's.

What would you do?  I'm concerned about scalability, maintenance and security.  I have a feeling that applications, such as 37 Signals BaseCamp, or even Amazon, don't create a new database for every client.

Friday, May 19, 2006
We do the "one DB per client" approach. Our competitors all do the "one huge DB" approach.

There are two major one-DB-per-client advantages:

1) Many clients think they would like to write their own reports and possibly full-blown apps against the DB (about 10% of them actually do it, but hey, whatever sells). This is obviously possible either way but easier and safer with separate DB's.

2) When we have operational problems, which is often enough that this is a major consideration, they generally affect only one customer at a time.

There are other benefits such as - some customers like to pay for their own hardware, you can tune customers independently, etc.

The drawbacks are obviously mostly in the setup and administration/deployment area. We have 4-5 people who do nothing but copy code around and make sure it all gets installed properly. Depending on your situation there might be big licence implications as well (maybe not for MySQL and ROR though).

Anyways, if it were me, I would base it mostly on the number and size of potential customers. Small # of big customers = one each, large # of small customers = shared.
NetFreak Send private email
Friday, May 19, 2006
"My colleague, on the other hand, thinks we should setup a separate MySQL db for every client ie. 1000 clients = 1000 db's."

The performance of this will be horrible.  Database are optimized to handle millions of rows efficiently they aren't designed to handle thousands of databases efficiently. 

I would suspect that MySQL has a hard (and low) limit on the number of open databases.  With a database-per-user you'd hit this limit early; meaning you couldn't have a large number of users logged in at once.

Furthermore, one database per user is a massive maintainance headache.
Almost H. Anonymous Send private email
Friday, May 19, 2006
This depends on the number of separate clients you have, and the size of their databases.  If you really have 1000 clients, and those databases are relatively small, then separate DBs are probably a bad idea.

However, if you have a limited number of large clients, then it makes sense to split things up.  My company takes this approach, but we only have a dozen or so hosted clients.
Timothy Mark
Friday, May 19, 2006
You can actually have it both ways - code as for the single DB solution, but you can aggregate users according to sensible criteria: Size, Geography, Customization

This way you can can start with one DB, but forklift out data when a client needs to be split out.

Then you can get to the point where you have multiple DBs and multiple DB servers relatively easily (although a single customer or group may always be in the same DB and same server).

(One side-effect of this approach is that you should avoid simple autonumber columns if you plan on doing significant forklifting - because merges might result in duplication.)
Cade Roux Send private email
Friday, May 19, 2006
I agree with Cade Roux. What you are asking about is generally referred to as "Multitenancy". A multitenant database is one where a single DB supports multiple customers (or 'tenants').

Here's the key insight: if you design your schema for multitenancy, then you get the best of all worlds: you can have a single DB for all your clients, *or* you can have one for each, *or* you can have everyone in one DB and that one pesky client who wants to run their own hardware in a separate DB.

On a side note, 80% of my projects this year have been to enable a database-backed web application for multitenancy. That's right, *12* separate client engagements in Jan - May, all because the original architect did not design the DB w/ multiple tenants in mind. This seems to be a very pervasive issue in the industry today.
PWills Send private email
Friday, May 19, 2006
Here is another point.

In a future, you might want to have some sort of CRM for your company to manage your clients. Gather some statistics, etc. Weight time you will need to develop and maintain this using one db for all data and one db per client.

Denis Krukovsky
Denis Krukovsky Send private email
Friday, May 19, 2006
We used the one DB per client approach (SQL Server 2000).  It's benefits were:

1. You could bring the DB down from production to replicate a problem they were having.
2. When a client leaves you, it's much easier to cleanup after sending them a data dump.
3. Significantly less chance of client data getting intermingled if some app developer forgot to include a "WHERE ClientId = @ClientId"
Example Send private email
Friday, May 19, 2006
Forgot one more:

4. When the business grows to where you have more clients than one physical DB server can handle, you can spread them across a new server.  This also means you can load-balance clients across multiple servers if you have a few large clients and many small ones.
Example Send private email
Friday, May 19, 2006
And, gathering statistics data on one for all / one per client, you might want to know the number of clients served, as pointed by Timothy. Things can be different if you have 2 clients or if you have 2000.

Denis Krukovsky - forum on development
Denis Krukovsky Send private email
Friday, May 19, 2006
Example, how many clients/databases did you have?
Almost H. Anonymous Send private email
Friday, May 19, 2006
Just make sure you use multiple databases and not tables or you'll end up on the The Daily WTF:
Almost H. Anonymous Send private email
Friday, May 19, 2006

Thanks for your input.  Another question: anyone know the max number of databases MySQL 4.1.x can have open?  I couldn't find this answer in the online MySQL docs.
Monday, May 22, 2006
I agree with PWills comments whole heartedly.  I write accounting/erp software and sell to large & small companies.

One of the features of the software is that it can handle multiple companies.  Behind the scenes, we use a single database for all companies; making the user choose a company at login if they have such access/knowledge, etc...

At any time, if a customer wishes to move one of his compnies to dedicated server hardware, it's etremely easy to export/inport the data.  And with table spaces, it's also really easy to put multitennants in a single DB cluster with data residing on their own set of hard disk spindles.
Tuesday, May 23, 2006
Clinical trial apps create a new database for each client, with different tables, the same tables with different columns, the same columns with differing sizes, etc. You need a good metadata repository to manage that, if it is what is truly needed.
Steve Hirsch Send private email
Tuesday, May 23, 2006

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics
Powered by FogBugz