The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

Generating SQL syntax for different databases

What are some of the techniques you guys are using to return SQL syntax based on a database (MySQL, MS SQL, etc.)?

Maybe something a la LINQ but watered down.
Fung Send private email
Friday, March 30, 2007
 
 
Not quite sure what you're asking... if you stick carefully to ANSI standard SQL, it'll work on just about any database.
Mike S Send private email
Sunday, April 01, 2007
 
 
My web application framework is based on the 3-tier architecture, which separates logic into the presentation (UI) layer, business layer and data access layer. This means that objects in the business layer must go through the data access layer in order to communicate with the database, so it is only the data access object (DAO) which constructs and executes any SQL statements. Unlike other frameworks I have a single DAO for all database tables, not a separate DAO for each table. Because there are subtle differences between the databases that my framework supports (MySQL, PostgreSQL and Oracle) I have a separate class file for each RDBMS, so the DAO can be instantiated from the relevant class file at run time.

Any differences between the various RDBMS engines are therefore isolated in the DAO and totally invisible to the presentation and business layers.
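A minimal Python sketch of that run-time selection idea (class and method names here are hypothetical, not Tony's actual framework):

```python
# One DAO interface, one subclass per RDBMS, chosen at run time,
# so the layers above never see dialect differences.

class GenericDAO:
    def limit_clause(self, count):
        raise NotImplementedError

class MySQLDAO(GenericDAO):
    def limit_clause(self, count):
        return f"LIMIT {count}"

class OracleDAO(GenericDAO):
    def limit_clause(self, count):
        # Oracle (pre-12c) has no LIMIT; ROWNUM is the usual workaround.
        return f"WHERE ROWNUM <= {count}"

DAO_CLASSES = {"mysql": MySQLDAO, "oracle": OracleDAO}

def make_dao(rdbms):
    # Instantiate the dialect-specific class named in configuration.
    return DAO_CLASSES[rdbms]()
```

The business layer only ever calls `make_dao(config_value)`, never a concrete class.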
Tony Marston Send private email
Sunday, April 01, 2007
 
 
ADODB (PHP DB-abstraction library) provides functions that return the SQL strings for constructs handled differently across databases. MySQL, for example, uses 'CONCAT(a,b)' while Postgres (and everything else?) uses 'a || b'. When constructing the query, you use
$query = 'SELECT '.$db->concat('a','b').' FROM...';
and the correct SQL will be returned based on the driver currently in use.
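The same dispatch idea in a rough Python sketch (class names are hypothetical; ADODB itself is PHP):

```python
# Each dialect object knows how to spell its own SQL fragments;
# the query-building code stays dialect-agnostic.

class MySQLDialect:
    def concat(self, *cols):
        return "CONCAT(" + ", ".join(cols) + ")"

class PostgresDialect:
    def concat(self, *cols):
        # Standard SQL string concatenation operator.
        return " || ".join(cols)

def build_query(db):
    return "SELECT " + db.concat("a", "b") + " FROM t"
```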
G Jones
Sunday, April 01, 2007
 
 
I tried sticking to ANSI SQL but that turned out to be too idealistic. =/

Using functions to return the proper SQL based on the driver sounds pretty good. Although there will still be problems with things like MS SQL's WITH (NOLOCK) hint: to get the equivalent in MySQL you have to set the transaction isolation level to READ UNCOMMITTED.

Or in MySQL, DROP TABLE IF EXISTS my_table: the IF EXISTS clause has no MS SQL equivalent and would have to be a separate existence check.
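Those per-dialect differences could each be hidden behind one function per dialect; a rough Python sketch (function names are hypothetical):

```python
# Some dialect differences are not one-expression swaps: the same
# logical operation may need a different NUMBER of statements.

def dirty_read_mssql(table):
    # MS SQL takes a per-table lock hint inline.
    return [f"SELECT * FROM {table} WITH (NOLOCK)"]

def dirty_read_mysql(table):
    # MySQL needs the isolation level set as a separate statement first.
    return [
        "SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED",
        f"SELECT * FROM {table}",
    ]
```

Returning a list of statements rather than a single string is what lets one interface cover both cases.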

BTW, I forgot to mention that in my case the SQL returned will be based on a .NET DbProviderFactory type. The DbProviderFactory type is only known at run time, so it can't be a compile-time solution.

Actually, is there a design pattern for this?
Fung Send private email
Monday, April 02, 2007
 
 
You can use read uncommitted in MS SQL Server as well (from Books online):

SQL-92 defines the following isolation levels, all of which are supported by SQL Server:

Read uncommitted (the lowest level where transactions are isolated only enough to ensure that physically corrupt data is not read).

Read committed (SQL Server default level).

Repeatable read.

Serializable (the highest level, where transactions are completely isolated from one another).
anomalous
Monday, April 02, 2007
 
 
+1 to Tony

We use an approach similar to Tony Marston's. I feel it's important to emphasize that most of our applications are CRUD databases, and with a little bit of forethought it's very easy to come up with one "file" or DAO that satisfies almost all of our data access needs going forward. When the time comes to port the application to a different database, we essentially rewrite the actual SQL calls in that one file.

G. Jones' approach is very intriguing, and right now I think the only reason we don't do anything like that is because the intern would have to learn both the "source" SQL notation, the "destination" SQL notation AND the substitution notation that he described. For an experienced team that works with it day in and day out, it's not an issue. It does make it hard for us to bring someone in just for a few days to translate SQL calls though.
TheDavid
Monday, April 02, 2007
 
 
An approach I've taken when building software that has to talk to multiple database platforms:

- Design the database schema as an XML document.
- Write XSLT transforms to generate the CREATE TABLE, CREATE VIEW, CREATE INDEX, etc. DDL from the schema.
- Write XSLT transforms to create the CRUD stored procedures for each table, making sure that wherever possible the names and calling parameters for the procs have ANSI SQL data types and names.
- Put no SQL in the middle tier; instead have all middle-tier code talk to the stored procs.
- Create new XSLT for each new database platform you need to support.

It's a lot easier to do this when you're building a DB app from scratch, obviously. 

Depending on your app, you may also be able to transform the schema into the source code of abstract base classes for all objects that persist as a single row in the database.  This is very handy from a DRY perspective, as your middle-tier validation code, your tables, and your stored procedures are now all generated from the same single set of data elements, so that if you change the data type or length of a column during development you only have to fix it in one place.
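A much-simplified Python stand-in for the XSLT step (the schema element and attribute names here are hypothetical): generate CREATE TABLE DDL from an XML table definition.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema format: a <table> element with <column> children.
SCHEMA = """
<table name="customer">
  <column name="id" type="INTEGER" pk="true"/>
  <column name="last_name" type="VARCHAR(50)"/>
</table>
"""

def create_table_ddl(xml_text):
    # Walk the schema once, collecting column definitions and key columns.
    table = ET.fromstring(xml_text)
    cols, pks = [], []
    for col in table.findall("column"):
        cols.append(f"{col.get('name')} {col.get('type')}")
        if col.get("pk") == "true":
            pks.append(col.get("name"))
    if pks:
        cols.append("PRIMARY KEY (" + ", ".join(pks) + ")")
    return f"CREATE TABLE {table.get('name')} (" + ", ".join(cols) + ")"
```

A real implementation would emit one transform per target platform, exactly as described above; the single-source-of-truth benefit is the same.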
Robert Rossney Send private email
Monday, April 02, 2007
 
 
For me the hardest part has always been handling SELECT statements. INSERT, UPDATE, and DELETE are easy because there is typically a one-to-one mapping against business objects, so you can autogenerate these statements if you can map properties of objects to columns/keys. But with SELECT statements you really can't autogenerate something that is so dynamic. For example, the WHERE clause for a given report usually isn't known until runtime. You can also have complex joins that simply don't lend themselves to autogeneration.

So for the typical CRUD app you can easily automatically generate the INSERT, UPDATE, and DELETE statements. But SELECT statements usually need to be done by hand unless you intend to spend a LOT of time trying to come up with some cross-DB capable query language like LINQ or NHibernate. And I just don't think that is worth all the effort.
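The easy half can be sketched quickly. Assuming a simple property-to-column mapping (names hypothetical), an INSERT generator in Python might look like:

```python
def build_insert(table, mapping):
    # mapping: object property name -> column name.
    # Values are bound as placeholders, never inlined into the SQL.
    cols = list(mapping.values())
    placeholders = ", ".join(["?"] * len(cols))
    return f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({placeholders})"
```

UPDATE and DELETE follow the same pattern keyed on the primary-key columns; it is only SELECT that resists this treatment.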
dood mcdoogle
Monday, April 02, 2007
 
 
Re: Select statements.

Admittedly, select statements are hard to deal with unless you're very well versed in the problem domain and you know what type of calls are likely. For example, if you were writing a real estate application and had some experience with the business, you could probably come up with a top ten list of most commonly called functions.

Off the top of my head: all properties, properties within a given price range, properties within a certain zip code, properties with X number of bathrooms and Y number of bedrooms. Also, for the boss, number of properties available, number of properties added, number of properties sold.

At which point, you realize that some of these only differ in terms of arguments provided. So you could write a wrapper function such as getProperties(). If no arguments, assume all. If you get zip=00001 as an argument, assume it's by zip code, and so on.

In your DAO, you dynamically determine the where clause based on the arguments provided. (You would essentially need to do the same basic idea if you used stored procedures.)

That said, I do admit real estate is a fairly simple and well understood model. I would not have a clue how to deal with Derivatives Pricing for example.
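That getProperties() idea, sketched in Python (table and column names are hypothetical; only the criteria actually supplied end up in the WHERE clause):

```python
def get_properties_where(**criteria):
    # Build a WHERE fragment only for the arguments that were passed;
    # no arguments means "all properties".
    clauses, params = [], []
    for col, val in sorted(criteria.items()):
        clauses.append(f"{col} = ?")
        params.append(val)
    where = " WHERE " + " AND ".join(clauses) if clauses else ""
    return "SELECT * FROM properties" + where, params
```

The same function serves "all properties", "by zip code", and "X bathrooms, Y bedrooms" without any per-query SQL.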
TheDavid
Monday, April 02, 2007
 
 
< For me the hardest part has always been handling SELECT statments. With SELECT statements you really can't autogenerate something that is so dynamic. >

Why not? My abstract database class has a single getData($where) method which handles all SELECT statements on all tables. The contents of the $where argument are generated automatically by the framework. For example, a LIST screen shows a series of records, and the user selects one or more of them by ticking a checkbox on each row. He then presses a navigation button to activate another form, so the framework constructs a WHERE clause using the primary key fields of the selected records, then passes this as a string to the selected form. When the new form appears it uses the passed string in its getData() call and automatically shows the first selected record, with controls to navigate to any others.

This is not rocket science, so what makes you think it is so difficult?
Tony Marston Send private email
Tuesday, April 03, 2007
 
 
"The contents of the $where argument is generated automatically by the framework. "

So how would a framework automatically generate where clauses of the following nature?

WHERE CUSTOMER.LAST_NAME LIKE(?) AND CUSTOMER.ADDED_DATE < ? AND CUSTOMER_PURCHASES.LAST_PURCHASE_DATE > ?

How about joining tables along with the associated where clauses? Performing summations on columns? Calling built in functions like UPPERCASE()?

I agree that doing a keyed read on a single row is trivial. But is that really the only type of SELECT statement that you do? Don't you ever receive criteria data from the user or join multiple tables together into a composite result set? Isn't the number of where clause parameters ever dynamic based on how a user fills in a form?

These are the things that I think are difficult to capture in a framework. NHibernate has its own query language that attempts to support these situations, but at the expense of writing an entire new language along with the equivalent constructs.
dood mcdoogle
Tuesday, April 03, 2007
 
 
Here's another example: we have two retailers using the same totals report. One retailer wants to include the sale of Gift Cards in the net sales amount. The other doesn't. We control this through an application level configuration setting called "Include Gift Cards in Net Sales". So the where clause is dynamic based on whether or not this business level configuration entry is set. Note that the data model and user interfaces have not changed at all. How do you account for these types of things in your framework automatically without having to just manually code it in?
dood mcdoogle
Tuesday, April 03, 2007
 
 
One more comment directed at Tony, if your framework contains a single getData($where) method, does that mean that you are always passing the where clause in as a string value? If so, doesn't that open you up to SQL injection attacks?

In order to have a single method in a framework that does all reading I would assume that you would need a where clause and array/list of parameter data values at the very least.
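A sketch of that signature in Python, using sqlite3 purely for illustration: the WHERE text holds only placeholders and the values are bound separately, which is what defeats injection.

```python
import sqlite3

def get_data(conn, table, where=None, params=()):
    # The table name comes from trusted metadata (a data dictionary),
    # never from user input. User-supplied values arrive only in
    # params, so they are bound by the driver, not spliced into SQL.
    sql = f"SELECT * FROM {table}"
    if where:
        sql += " WHERE " + where
    return conn.execute(sql, params).fetchall()
```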
dood mcdoogle
Tuesday, April 03, 2007
 
 
dood mcdoogle: So how would a framework automatically generate where clauses of the following nature?

WHERE CUSTOMER.LAST_NAME LIKE(?) AND CUSTOMER.ADDED_DATE < ? AND CUSTOMER_PURCHASES.LAST_PURCHASE_DATE > ?

My framework employs a series of search screens which allow the user to enter values for various field names. By pressing the SUBMIT button these values are converted into a WHERE string and passed back to the previous screen, where that string is processed.
Tony Marston Send private email
Wednesday, April 04, 2007
 
 
dood mcdoogle: How about joining tables along with the associated where clauses?

All related tables are identified in the data dictionary, and this information is made available to each table class. This information is then used to add JOIN clauses into the SQL SELECT statement.

dood mcdoogle: Performing summations on columns? Calling built in functions like UPPERCASE()?

It is possible to customise the SELECT statement in whatever way you desire, so you can perform whatever summations you want, or call whatever functions you want. But for 95% of cases no such customisation is required, so whatever the framework generates is good enough.
Tony Marston Send private email
Wednesday, April 04, 2007
 
 
dood mcdoogle: One more comment directed at Tony, if your framework contains a single getData($where) method, does that mean that you are always passing the where clause in as a string value? If so, doesn't that open you up to SQL injection attacks?

Yes, the where clause is passed as a string value, but NEVER in the URL. It is passed around in the session data which is maintained on the server.

dood mcdoogle: In order to have a single method in a framework that does all reading I would assume that you would need a where clause and array/list of parameter data values at the very least.

Each database table class includes a structure file from the data dictionary which identifies all the fields which exist on that database table, their specifications (type, size, etc), which are primary or candidate keys, and all relationships. This information is then used in the construction of the SELECT statement. If the default SELECT statement needs to be customised then it can be. The SELECT statement is made up of a series of different clauses, so it is easy to instruct the framework to use a specific customised string for any of these clauses instead of having the framework generate one for you.

I have been using this technique to build complex SELECT statements in my web applications for several years, so I know that it works.
Tony Marston Send private email
Wednesday, April 04, 2007
 
 
Thanks for the explanation Tony.
dood mcdoogle
Wednesday, April 04, 2007
 
 
In the first version of the data access layer I wrote (a mid-level O/R mapper) I had an IDataAdapter interface with different implementations for each provider (MSSQL, MySQL etc); behind each there tended to be a CommandBuilder for each provider.

This has the benefit that you can provide some truly diverse provider support (XML files), but it can lead to a lot of repetitive code.

In the second version I added a rather powerful query system to query over the object model and use the object relationships. I found that here it made a lot more sense for the Query object to be able to compile itself to SQL rather than duplicating massive amounts of code for each provider. By this stage I had decided to stick with SQL providers, so having to compile to something like XPath became a moot point.

I created an IDataContext interface that provided all platform-specific information; for example, an IDbConnection was created as a SqlConnection if the provider was MSSQL. For the actual SQL itself it mainly returned a collection of format strings; for instance, the ParameterFormat was "@{0}" for MSSQL and "?" for Access. This allowed one CommandBuilder for the entire project and allowed the Query object to compile itself to complex SQL statements.
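A rough Python sketch of the format-string idea (names loosely follow the description above; this is not Nigel's actual code):

```python
class DataContext:
    # One context per provider holds the provider-specific format
    # strings; a single command builder consumes them all.
    def __init__(self, parameter_format):
        self.parameter_format = parameter_format

MSSQL = DataContext(parameter_format="@{0}")   # named parameters
ACCESS = DataContext(parameter_format="?")     # positional parameters

def build_parameters(ctx, names):
    # The builder never branches on the provider; it just formats.
    return [ctx.parameter_format.format(n) for n in names]
```

Keeping the variance in data (format strings) rather than code is what collapses N CommandBuilders into one.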

I think both methods are valid; it really depends on how large you expect the variances to be.

Hope this helps
Nigel Sampson Send private email
Monday, April 09, 2007
 
 

This topic is archived. No further replies will be accepted.
