The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

SaaS multi-companies database design

Hi,

I’m designing a web application to be offered as SaaS to multiple companies.

I need your feedback on the technical challenge of designing the database and therefore the application.

What is the best approach?

  • Create a database for each customer or,
  • Create one database for all customers and use a customerID on each table.

I went through this BoS post http://discuss.joelonsoftware.com/default.asp?design.4.400013.13 but I still have questions:

If you go with the first option and create a database for each customer, then you end up with – hopefully – thousands of databases on your server(s).

How do you manage updates?
Even a small change in a stored procedure must be replicated in thousands of databases.

What tools do you use to do it?

What about hosting? If you have your own dedicated server, then you can make changes on the spot. What about shared/virtual hosting, where you only get a web-based SQL admin tool?

If you go with the second option and create one database for all customers, you end up with one database, but you must distribute the load across multiple servers and have a few customers on each.
Updates are much easier, but you may have to deal with performance issues.

Do you know how FogBugz does it?

Thanks.
MicroISV Wannabe
Wednesday, March 26, 2008
 
 
If you design a database to handle multiple customers, you can still have multiple instances of it if needed for distributing load - e.g. first 10,000 customers on instance 1, next 10,000 on instance 2, ...
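A minimal sketch of how that routing could work, assuming a small shared directory database (the table and column names here are invented for illustration):

```sql
-- Hypothetical directory table in a shared "routing" database.
-- Each customer row records which server instance and database
-- hold that customer's data.
CREATE TABLE dbo.CustomerDirectory (
    CustomerID   int     NOT NULL PRIMARY KEY,
    InstanceName sysname NOT NULL,  -- e.g. 'SQLSRV02'
    DatabaseName sysname NOT NULL   -- e.g. 'SaaS_Instance2'
);

-- At login, the application looks up where the customer lives
-- and opens its connection against that instance/database.
DECLARE @CustomerID int = 12345;
SELECT InstanceName, DatabaseName
  FROM dbo.CustomerDirectory
 WHERE CustomerID = @CustomerID;
```

Moving a customer to another instance then becomes a data copy plus a single UPDATE of its directory row.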
Mike Stockdale
Wednesday, March 26, 2008
 
 
Thanks John,

Those threads didn't come up in the search.
MicroISV Wannabe
Wednesday, March 26, 2008
 
 
"Even a small change in a stored procedure must be replicated in thousands of databases."

That's not quite true, if you're using SQL Server. You can call a stored procedure in another database like this:

EXEC [databasename].[dbo].[procedurename]

"If you go with the second option and create one database for all customers,"

You might want to create a database per ten customers, distributing the load over different servers.


Happy programming :)
Eddy Vluggen
Wednesday, March 26, 2008
 
 
Eddy,

I like the idea of calling a stored procedure from a "master" database which holds only the sps for all databases.
If a customer needs some custom work I can add sps to his db only.

Are there any performance issues with doing this?

About the second part of your post:

This is exactly what I mean by saying:
"If you go with the second option and create one database for all customers, you ending up with one database but you must distribute the load in multiple servers and have a few customers on each."
MicroISV Wannabe
Wednesday, March 26, 2008
 
 
Eddy,

How will you form your stored procedure to access tables in different databases?

For example:
If I create a "master" database and store all sps there, and each customer has its own database, the sps will need to form the query dynamically to access the tables in the format database.tablename.
MicroISV Wannabe
Wednesday, March 26, 2008
 
 
If the amount of data per customer is large, you might want to have one database per customer.

1) It is easier to set up "needs only" access to a particular customer's data if each is in a separate database. The security is much easier to understand and audit.

2) It keeps single-customer changes from having any chance of affecting other clients. It compartmentalizes changes.

3) It makes it easier to load balance. If a particular customer gets too big for a server, it is much, much easier to move the customer to another server if the data is separate.

4) If you no longer have the client, it is much, much easier to decommission the data if each customer is in a separate database.
somebody else
Wednesday, March 26, 2008
 
 
Note that some big clients get cranky if their data is merged with other customers.
somebody else
Wednesday, March 26, 2008
 
 
+10 to somebody else

For all these reasons you should generally use separate databases, not intermingle lots of customers' data within one database.

Don't worry about the maintenance issues. If you can write a script to apply a change, it is trivial to extend it to apply to more than one database. It is then trivial to allow you to selectively apply it to subsets of databases, if you ever need to do this.

As a customer, I'd be pretty nervous if my data was stored in the same database as someone else's, perhaps a competitor. Could you really 100% guarantee that we will never see each other's data, under any circumstances? I don't mean 99.9% certainty, I mean 100%.

Also, can you guarantee that actions, malicious or otherwise, by another customer will have no impact on my data? In reality, this is harder to be certain of, but if you use separate databases it is a lot easier to make a good case.

I suppose, if you are stuck with cheap hosting that charges per database, then you might want to mix the data, but I would say it is better to change host than do that.

Apart from anything else, it just seems so natural to have www.mySaaS.com/client1, or client1.mySaaS.com, map to the client1 database.

Although you may have to be wary of war-dialling (enumeration) opportunities if you truly go for such a trivial mapping, of course. Whether or not this is an issue for you will depend upon your specific application.
Odysseus
Thursday, March 27, 2008
 
 
"How you will form your stored procedure to access tables in different databases?"

Run from the master database, querying two other databases located on the same server:

CREATE PROCEDURE TEST
AS
BEGIN
    SELECT FirstName, LastName
      FROM [Northwind].[dbo].[Employees]
    UNION
    SELECT FName, LName
      FROM [pubs].[dbo].[Employee]
END
GO

Odysseus has a point; security might be an issue. It might even be a problem for some customers if you mix the data, depending on the sensitivity of the data and the paranoia of the users/managers involved.

Read-only data may be tucked away safely in a shared database, as well as shared stored procedures. The customer-specific data could have its own (encrypted?) database.

Note that encryption affects performance. The higher the performance load, the higher the cost.

Good luck :)
Eddy Vluggen
Thursday, March 27, 2008
 
 
Odysseus, Eddy,

You convinced me to follow the "one database per client" model.

I don't believe it's better, but it's definitely the lesser of two evils.

Now, another question:

Which web hosting companies do you think are good for such a model?
Your experiences?

Thanks.
MicroISV Wannabe
Thursday, March 27, 2008
 
 
"I don't believe is better, but is definitely the lesser of two evils."

This seems like a contradiction in terms, although it is just a semantic thing.

I believe that the minor disadvantages are vastly outweighed by the major advantages. Of course, this is all IMHO, YMMV, etc.
Odysseus
Thursday, March 27, 2008
 
 
@Eddy -
> CREATE PROCEDURE TEST
> AS
> BEGIN
>    SELECT FirstName, LastName
>      FROM [Northwind].[dbo].[Employees]
>    UNION
>    SELECT FName, LName
>      FROM [pubs].[dbo].[Employee]
> END
> GO

I don't think this works with the scenario he is talking about. He isn't talking about known DBs. He has thousands (hopefully) of databases with the same tables. You reference Northwind and pubs above, when it would really be "Client1", "Client2", etc., and there really wouldn't be a union; it would have to be some dynamic way of saying "this is the database I want".

I'm guessing that you were more showing how to access multiple databases from a stored procedure in an outside database.

The OP should know that he will have to have dynamic code inside these master SPs. If you're doing that, you might as well just have dynamic SQL and not mess with the SPs. (I like SPs, so I'm not advocating that, though.)

Is that confusing enough??  :)
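For what it's worth, such a "master" SP would look roughly like this (the proc, parameter, and table names are invented; a sketch, not a recommendation):

```sql
-- Hypothetical master proc: builds the query for whichever client
-- database is requested, then runs it via dynamic SQL.
CREATE PROCEDURE dbo.GetEmployees
    @DatabaseName sysname
AS
BEGIN
    DECLARE @sql nvarchar(max);

    -- QUOTENAME guards against injection through the database name
    SET @sql = N'SELECT FirstName, LastName FROM '
             + QUOTENAME(@DatabaseName) + N'.[dbo].[Employees]';

    EXEC sp_executesql @sql;
END
GO
```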
Crazy Eddie
Thursday, March 27, 2008
 
 
Eddie,

This is exactly the point I've made to Eddy.

I have to create the sp's SQL dynamically every time and call sp_executesql, which is not so good for performance.

I don't see any better way to do this than replicating the sps in each database.

Maybe someone can point me to a good tool that does the synchronization.
MicroISV Wannabe
Thursday, March 27, 2008
 
 
"The op should know that he will have to have dynamic code inside these master SPs."

Nah. You don't want dynamic code. Each database should be independent. You can compile the same procedure in each database if the objects it references are relative (i.e., not qualified with a database name). Or, if you want them qualified, you could do some preprocessor-type thing.
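In other words, if the proc body sticks to unqualified names, the identical script compiles in every client database and binds to that database's own tables. A sketch (the table and proc names are assumed):

```sql
-- Run this same script in Client1, Client2, ... unchanged;
-- the unqualified table name resolves inside whichever
-- database the proc is created in.
CREATE PROCEDURE dbo.GetEmployees
AS
BEGIN
    SELECT FirstName, LastName
      FROM dbo.Employees;   -- no [database]. prefix
END
GO
```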
somebody else
Thursday, March 27, 2008
 
 
"I don't see any better way to do this, than replicating the sps on each database."

If you design the database and the procs "correctly", this is not hard to do (see earlier message).

We have the base proc code in source control and compile it across many databases.

Replicating procs isn't a big deal because you aren't (or shouldn't be) doing it that frequently. It would be pretty simple to automate this.
somebody else
Thursday, March 27, 2008
 
 
"It would be pretty simple to automate this."

Yeah, that is what I meant earlier. If you script a stored procedure, you can easily wrap it in a loop to apply it to >1 database.
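A rough T-SQL sketch of such a loop, assuming SQL Server 2005+ and an assumed 'Client%' naming convention for the per-customer databases:

```sql
-- Iterate over all client databases and run the same deployment
-- statement in each one's context.
DECLARE @db sysname, @proc nvarchar(400);

DECLARE dbs CURSOR LOCAL FAST_FORWARD FOR
    SELECT name FROM sys.databases
     WHERE name LIKE 'Client%';   -- assumed naming convention

OPEN dbs;
FETCH NEXT FROM dbs INTO @db;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Calling [db].sys.sp_executesql executes the batch
    -- inside that database's context.
    SET @proc = QUOTENAME(@db) + N'.sys.sp_executesql';
    EXEC @proc N'/* ALTER PROCEDURE ... script goes here */';
    FETCH NEXT FROM dbs INTO @db;
END
CLOSE dbs;
DEALLOCATE dbs;
```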
Odysseus
Friday, March 28, 2008
 
 
@Crazy;
You're right :)
Eddy Vluggen
Friday, March 28, 2008
 
 
>> Yeah, that is what I meant earlier. If you script a stored procedure, you can easily wrap it in a loop to apply it to >1 database.

Odysseus,

Any ideas where I can find such a script/tool?
Does something like that exist on the market, or should I quit looking and build one instead?
MicroISV Wannabe
Friday, March 28, 2008
 
 
You should just write it yourself.

If you look at the system tables, you can get the names of all the databases, then write a script to write the actual script you want. So you get all the databases, then you loop through them and fire off the update script for each one.

This writing-a-script-to-write-the-script trick is something I learnt from a DBA many, many years ago, and it has come in handy many times.
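The trick is roughly this: a SELECT over the system catalog that emits the script you then copy out and run (the LIKE filter is an assumed naming convention):

```sql
-- Generate one USE + EXEC pair per client database; run the
-- result set as a new script to apply the change everywhere.
SELECT 'USE ' + QUOTENAME(name) + ';' + CHAR(13) + CHAR(10)
     + 'EXEC (''/* your update script here */'');'
  FROM sys.databases
 WHERE name LIKE 'Client%'
 ORDER BY name;
```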
Odysseus
Friday, March 28, 2008
 
 
"You should just write it yourself."

Yes, it really isn't hard. Once you've started to do this, you might find that you can knock them out quickly.

"If you look at the system tables, you can get the name of all the databases, then write a script to write the actual script you want. So you get all the database, then you loop through them and fire off the update script for each one."

It doesn't even need to be that complicated. You could have a list of databases in some file and iterate through that.

Keep in mind that any script you write should be able to target a subset of the databases on the server, not every one. You want the ability to push things out to a database, or a few databases, at a time.
somebody else
Friday, March 28, 2008
 
 

This topic is archived. No further replies will be accepted.

Other recent topics
 
Powered by FogBugz