The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

Best Practices for Managing Database DDL Files

We are having some internal debate over how to manage our database schema (DDL scripts). Is it best to use a single file to store all of our database DDL scripts, or should we use one file per database object?

We are using SQL Server 2005, and need to be able to store the DDL files in source control.

We need the capabilities to drop and recreate the database without preserving any of the data. We also would like the option of preserving the existing data and upgrading to the next version of the schema.

Does anyone have any advice or best practices that they have used in the past?
Looking for Advice
Wednesday, April 04, 2007
Add an "if exists drop" for each of the tables immediately before the table.  If you need to recreate individual tables, keep individual files.  If you do all the tables at one time, keep one big file.  Either way, they should all be in CVS/SVN.
KC Send private email
Wednesday, April 04, 2007
I've been doing this for a long time and I can tell you that no matter what you do it will be painful. Both approaches have their pros and cons.

Keeping it all in a single file makes it easy to create a complete database build script. But it is rougher to manage with multiple programmers because everyone is always trying to get at the file at the same time. Also, to do an incremental update on an existing database you end up having to rip out that one little portion at the end that you just changed anyway unless you make sure that your script can always be run more than once on the same DB. This is definitely doable but takes a lot of discipline. It only takes one guy to blow the whole script simply because he forgot to check for the existence of a column before adding it again.

Keeping them in smaller files is nicer from a source control perspective but requires you to either concatenate them all together into a single script or figure out how to execute them sequentially when building a new database from scratch.

So individual files favor incremental updates to databases where one large file favors building databases from scratch.

I'm working on a new project and I've decided to take a different approach. I'm keeping all "DDL" in code. I have a program that contains all of the updates needed as C# code. I have special methods that I've created to do simple things like create columns, add/drop tables, and other common tasks. So instead of actually writing SQL you call a method like addColumn("tablename", "columnname", dataType). The benefits of this approach are:

1) It integrates easily into source control. You are writing C# code and not SQL so it is a more natural fit. The code is part of the project so you aren't trying to maintain two separate workspaces all the time.

2) It can be made to work for multiple database vendors. You call generic addColumn and addTable methods instead of raw SQL. If there are differences between vendors you can encapsulate this in the code that executes the call. For example, calling addColumn("mytable","salary",ColumnType.Money) on a SQL Server database would cause the built-in Currency column type to be used. But on a Firebird database it might use a standard DECIMAL column instead since no native Currency column type exists. You can also encapsulate differences in syntax for default values or whatever you need to change.

3) Each statement automatically checks to see if columns or tables already exist so you don't need to keep writing the SQL to check this. This means that the code can be run multiple times with no adverse impact. So if you run the program on an empty database it goes through and creates all the tables and columns and populates any default data values (as per code). If you run it on an existing database it runs through the code and brings the database up to date.

4) It gives the user an easy and automated way to perform database updates. Instead of handing someone a set of scripts that they have to run through some SQL command line program they can just double click on the app and it does the dirty work. The error handling is centralized and the user gets good feedback on what is happening.

5) You can also run plain raw SQL through it. Anything that you can do from a standard ADO.NET Command you can do here. So instead of writing complex SQL to query values and branch if something isn't right you are just writing simple C# code. It makes fixing or setting values based on complex criteria very easy compared to writing the same thing in SQL.

I've only been using this approach for a few months but I really think I'm going to like it. It solves a lot of the old problems with maintaining SQL scripts. It is much more user and programmer friendly and removes the need to keep writing the same old nasty SQL to determine if columns/tables already exist before adding them again. I'm sure that I will run up against some issues sooner or later. But so far it's been really nice just writing C# code instead of SQL.
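The approach dood describes (his version is C#) might look roughly like this sketch in Python, with sqlite3 standing in for the real database; the class and method names here are assumptions, not his actual code:

```python
import sqlite3

class SchemaUpdater:
    """Code-driven DDL: every operation checks for existence first,
    so the whole update program can be re-run against any database
    version with no adverse impact."""

    def __init__(self, conn):
        self.conn = conn

    def table_exists(self, table):
        cur = self.conn.execute(
            "SELECT 1 FROM sqlite_master WHERE type='table' AND name=?",
            (table,))
        return cur.fetchone() is not None

    def column_exists(self, table, column):
        cols = [row[1] for row in self.conn.execute(f"PRAGMA table_info({table})")]
        return column in cols

    def add_table(self, table, pk):
        if not self.table_exists(table):
            self.conn.execute(f"CREATE TABLE {table} ({pk} INTEGER PRIMARY KEY)")

    def add_column(self, table, column, sql_type):
        # Vendor differences (MONEY vs. DECIMAL, etc.) would be mapped here.
        if not self.column_exists(table, column):
            self.conn.execute(
                f"ALTER TABLE {table} ADD COLUMN {column} {sql_type}")

conn = sqlite3.connect(":memory:")
up = SchemaUpdater(conn)
for _ in range(2):  # running the update program twice is harmless
    up.add_table("Employees", "EmployeeID")
    up.add_column("Employees", "Salary", "DECIMAL")
```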
dood mcdoogle
Wednesday, April 04, 2007
keep in separate files, one per table

write a little script to run them all to set up a new database
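A minimal sketch of such a runner (Python with sqlite3 standing in for SQL Server; the numeric filename prefix as an ordering convention is an assumption, not from the post):

```python
import sqlite3
import tempfile
from pathlib import Path

def build_database(conn, script_dir):
    """Execute every .sql file in script_dir, in filename order."""
    for script in sorted(Path(script_dir).glob("*.sql")):
        conn.executescript(script.read_text())

# Demo: two per-table files, run in order via a numeric prefix.
with tempfile.TemporaryDirectory() as d:
    Path(d, "001_customers.sql").write_text(
        "CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY);")
    Path(d, "002_orders.sql").write_text(
        "CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY,"
        " CustomerID INTEGER REFERENCES Customers);")
    conn = sqlite3.connect(":memory:")
    build_database(conn, d)
```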
Mike S Send private email
Wednesday, April 04, 2007
I'd also recommend a wiki to document what the tables/stored procs do. We just had a horrible situation where one of the developers decided to delete several tables and their stored procs ("Oh, we don't use those anymore!"), blowing out most of the logging, which he didn't understand, work with, or care about.

We use Red Gate's products, such as SQL Compare.
Wednesday, April 04, 2007
As dood and Mike have implied, it depends on the job.

If you were creating software that is meant to be sold as a package, I really do like dood's approach. This setup would allow you to prepare scripts (so to speak) for each of the major vendors with hopefully minimal hassle.

On the other hand, if you're a corporate IT shop, Mike's approach is better; lots of mini files plus a wrapper script that calls the right files when needed can also drastically simplify DBA tasks that don't involve your program at all.

If you have a QA department, it depends on whether they want to tear down and rebuild the entire application stack each time, or if they prefer to break things up into small black boxes that can be tested independently (dood and Mike, respectively).

So basically, if your application needs to run on multiple databases, have the application generate the SQL needed. (There are a variety of ways to do it.) If you only support one, separate the application from the database.
Wednesday, April 04, 2007
Dood has good reasons for doing it the way he has described (application based). Keep in mind that this would bloat your code if you did it in the main application. Using a separate application or dll for database updates takes care of that problem. It works pretty well to just send a customer a program and say "run this". Also remember that in the update application, you'll need to use a db user with the appropriate credentials to create the db objects.

If you store multiple scripts in source control, you'll need to manage them to indicate which ones are needed for production and in which order. Moving unused scripts from a "Production" folder in source control to a "Historical" folder is something that is easily forgotten.

I'd vote for the multiple scripts with a wrapper to roll them up. If you wanted to get fancy, you could roll the scripts into a resource file and run them from an update application - no heavy duty code needed.

I agree with the advice. Red Gate's packages save me an immense amount of time. I rarely handcode DDL anymore thanks to their tools.
Mike Saeger Send private email
Wednesday, April 04, 2007
You might want to take a peek at Visual Studio for Database Professionals. We've been evaluating it for a few months and I have to tell you that I am really excited about this product. MS still has some work to do on it, but so far it's helped me out tremendously.
Wednesday, April 04, 2007
Thanks for the advice, there were some good points that I had not considered.

It sounds like the best approach is going to be to use multiple small files, one per object, and to try to use Red Gate's SQL Compare to create the migration scripts as the schema changes.

I'm intrigued by the idea of creating a C# application to manage the database scripts. This sounds like a great idea, but the DBAs would not allow it, and it would take a fair amount of time to create.
Looking for Advice
Thursday, April 05, 2007
>I'm intrigued by the idea of creating a C# application to manage the database scripts.

Take a look at how DotNetNuke handles version upgrading. One of the guys at the office is impressed with it (I haven't had time to investigate). Apparently you drop a new script file into a directory, and it detects and runs it.
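The drop-a-script-in-a-directory scheme described above could be sketched like this (Python with sqlite3 as a stand-in; the AppliedScripts tracking table and its name are assumptions, not how DotNetNuke actually does it):

```python
import sqlite3
import tempfile
from pathlib import Path

def apply_new_scripts(conn, script_dir):
    """Run each .sql file in script_dir exactly once, recording
    which ones have already been applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS AppliedScripts (Name TEXT PRIMARY KEY)")
    applied = {r[0] for r in conn.execute("SELECT Name FROM AppliedScripts")}
    for script in sorted(Path(script_dir).glob("*.sql")):
        if script.name not in applied:
            conn.executescript(script.read_text())
            conn.execute(
                "INSERT INTO AppliedScripts (Name) VALUES (?)", (script.name,))

# Demo: the second scan finds nothing new, so the script runs only once.
conn = sqlite3.connect(":memory:")
with tempfile.TemporaryDirectory() as d:
    Path(d, "001_create_orders.sql").write_text(
        "CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY);")
    apply_new_scripts(conn, d)
    apply_new_scripts(conn, d)
```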
Thursday, April 05, 2007
+1 to Mike S.

Don't have much more to add, except that the one-file-to-rule-them-all approach is likely to be very difficult to maintain during the normal mutations a living system undergoes.
Steve Hirsch Send private email
Friday, April 06, 2007
Another vote for multiple scripts.

Especially if you're entering them into source control. Easier to see all of the diffs for each database object (added, deleted, and mod'ed script files).

I used this little tool to generate DDL as well as INSERT scripts:

It also handles dependencies so that scripts are run in the right order. Before I found this handy tool, I had to maintain a text file myself that listed the order that scripts should run.
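The dependency-ordering problem mentioned above is essentially a topological sort: parent tables must be created before the tables that reference them. A minimal sketch (the dependency map here is hypothetical example data, and the Python stdlib is doing the work the tool does automatically):

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Map each table to the tables it depends on (via foreign keys).
depends_on = {
    "Customers":  [],
    "Products":   [],
    "Orders":     ["Customers", "Products"],
    "OrderLines": ["Orders"],
}

# static_order() yields each table only after all of its dependencies,
# giving a safe execution order for the per-table scripts.
order = list(TopologicalSorter(depends_on).static_order())
```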

For each new rev of the database, I re-generate the scripts for the entire database and then commit that to source control.

The missing ingredient that I'd want to *eventually* have as part of my automated build is to generate migrate scripts for different releases. To do this I'd need a compare tool -- Red Gate is a good one I've used. EMS' looks good too, but haven't tried it.
Jason A Send private email
Friday, April 06, 2007
A few things I should add to my previous post.

When regen'ing scripts for the entire database for every schema rev, there are a couple of things you'd need to handle yourself when committing to source control:

* Database objects that have been re-named.
If you're creating a script file for every database object, and you've re-named an object then the name of the script file will likely have changed.

* Database objects that have been dropped.
Script files would remain in source control if all you did was commit new or mod'ed files. For this you could use a diff tool to weed out those deleted files (KDiff3 is one I use a lot).
Jason A Send private email
Friday, April 06, 2007
We have a Java web application that we host on our own machine, so this solution may not be ideal for a shrinkwrap application. We use Rails migrations with Ruby to do all DDL... Hibernate handles the mappings while Rails does the database versioning. This has worked nicely for us. We had a similar system in place for a .NET/SQL Server application before; however, I have to say I prefer writing add_table vs. DDL.

Saturday, April 07, 2007
My way:
 - one file with DDL for complete database
 - after each change in main DDL, I create diff file that upgrades existing databases to current level
 - diff files are sequentially numbered
 - when the application starts, it checks whether there are diff files newer than the last one applied to the database, and if it finds any it executes them.

Main DDL and diff files are stored in subversion.
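The numbered-diff scheme above could be sketched like this (Python with sqlite3 as a stand-in; the SchemaVersion table name and the zero-padded filenames are assumptions made for illustration):

```python
import sqlite3
import tempfile
from pathlib import Path

def upgrade(conn, diff_dir):
    """Apply every diff file (0001.sql, 0002.sql, ...) numbered higher
    than the last one recorded in the database."""
    conn.execute("CREATE TABLE IF NOT EXISTS SchemaVersion (Version INTEGER)")
    current = conn.execute("SELECT MAX(Version) FROM SchemaVersion").fetchone()[0] or 0
    for diff in sorted(Path(diff_dir).glob("*.sql")):
        number = int(diff.stem)
        if number > current:
            conn.executescript(diff.read_text())
            conn.execute("INSERT INTO SchemaVersion (Version) VALUES (?)", (number,))

# Demo: run at two points in the schema's history; only new diffs execute.
conn = sqlite3.connect(":memory:")
with tempfile.TemporaryDirectory() as d:
    Path(d, "0001.sql").write_text(
        "CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY);")
    upgrade(conn, d)
    Path(d, "0002.sql").write_text(
        "ALTER TABLE Customers ADD COLUMN Name TEXT;")
    upgrade(conn, d)  # re-running skips 0001 and applies only 0002
```

Because upgrades only ever move forward through the numbered diffs, this approach preserves existing data, which covers the original poster's second requirement.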
Wednesday, April 11, 2007

This topic is archived. No further replies will be accepted.
