The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

DB question

You are writing a n-tier application. Suppose you want to retrieve an object A that has an associated object B to display to the user.

For example, object A would be "product" and B would be "supplier".

Do you use SQL joins or multiple round-trips to the database server to build and retrieve the complete object?

Tuesday, May 23, 2006
You would preferably use a commercial/open source data mapping layer to handle these details for you. Generally they would implement this type of thing with a join for performance reasons as each round trip to the database slows you down.
Justin Kolb Send private email
Tuesday, May 23, 2006
We generally (although there are exceptions to the rule) use a "lazy loading" approach. Why grab data until you're sure you're gonna need it. Grab the product data with one trip, and that may be all you need. Why grab the supplier data? You may not even need it. Send another request to the server when you need the supplier data.

However -- if, for a large percentage of your request, you're going to need *both*, then by all means, grab them both with one round trip. Only do this if you can demonstrate that the majority of requests will need both OR if the two round-trips (above) will put you outside your response time parameters. 

Works for us.
Tuesday, May 23, 2006
I agree with Sgt. Sausage's comments.  I write accounting software.  If, for example, the Purchase Order screen is open, I have essentially 2 SQL calls.  One to grab header information about the PO and one to grab the lines/details.  And the reason we make 2 calls is because the calls are then client side cached separately.  If any changes are made, we keep those changes in cache and commit when leaving the screen.

There are other drill downs on the screen such as vendor's address details, planned shipping schedules, current inventory QTYs, etc... that are not executed (or bounced off the server) until the screen or data is actually needed even though the idea of the data already being there to the end user seems as if they just need to click a button or change tabbed views, etc...

Worked for us too...
Tuesday, May 23, 2006
Think of database calls as a fast food restraunt counter.

"What's your order?"

"A #6 combo with a coke"



They don't always get a cup because they don't know if you'll want a drink.

Calls should be in and out...quick as possible.
D in PHX Send private email
Tuesday, May 23, 2006
For something simple, yes this is the way to go. However, you must also plan ahead for complications to come. Let’s say you want to later get the product based on some filter criteria, your database engine is very powerful at set based querying. It is really bad practice to go and iterate through record sets in middle and front end code when you could have done this on the database.

Also, if you have more than one screen presenting more or less the same data, you want to reuse this query. Rater select the extra data than write separate queries for each, you can always split it later if you really must. A lot also depends on your performance constraints. If this is a website, any additional data across the wire will slow things down, now it is better to only fetch what you need. On a 10M intranet, 1K waste will not even be noticed. Simple answer; the way you handle your data is a function of your entire design.
Marius Mans Send private email
Tuesday, May 23, 2006
"Rater select the extra data than write separate queries for each, you can always split it later if you really must."

Pulling data down too early on multi-user systems can have unintended cross-over effects.

If someone adds an order between you looking at the customer screen and the customer's order screen then you want to see the latest order included, not the cached version of the list which doesn't have it.

That's annoying when you have the customer on the phone, and they can only refer to things them as "the last order we sent you"

It's OK if you're planning to implement asynch dataset changed notifications... bucket open, worms everywhere..
Katie Lucas
Tuesday, May 23, 2006
That would be a can of worms, not a bucket wouldn't it. It is 10 at night, which means I've been working >12 hours.

Yikes. I'm writing C++ in another window.. I suspect that may not be the best quality code ever.
Katie Lucas
Tuesday, May 23, 2006
you have to look at your architecture, and use cases. 

Multiple trips to the DB (e.g "chatter") isn't bad when your web server and DB server are right next to one another on a gigabit switch.  But chatter is horrible for a client app going over a very slow WAN.  But, loading millions of unneeded records into memory all at once to avoid a little chatter can be worse.  Everything else is a grey area that you're just going to have to test out in your appliction.

I would look at your 80/20 case (to start).  Solve that.  Then look at your 90/10, and see if your solution still works. 

Don't just assume chatter is always worse than one massive load, or vice versa.  The answer is always "it depends". 

...a single test is worth a thousand expert opinions.
Another Anonymous Coward
Wednesday, May 24, 2006
I think people are analyzing this too deeply. There's nothing to suggest here that the OP wasn't retrieving data that was need straightaway, or that part of thedata was for optional display.

If you need data from two tables then joining is better -- two round trips makes sense only if you don't necessarily need one set of data right away, especially since it can change before you read it.
David Aldridge Send private email
Wednesday, May 24, 2006
"Don't just assume chatter is always worse than one massive load, or vice versa.  The answer is always "it depends".  "

I agree. You just need to test it against the worst case scenario. We always struggle with this. Especially with reporting when the amount of data in the database gets large. If you just need a small amount of data it can sometimes be quicker to do separate "keyed reads" than to join on some condition that may not be indexed. Especially when the amount of data in the database gets large.

You just need to test both ways. And be sure to load your database with 2 years worth of data. Because that's when you'll really find out how badly you've written/designed your query.
Wednesday, May 24, 2006
I feel that if you are just displaying details of a single record, just perform the TABLE JOIN and get it over with. Most databases can do it very fast and unless your DB server is already hitting max utilization, don't be afraid to do a JOIN especially if indexes can be used.

If you need to display many records (example many products with many suppliers) in a single page, you will need to pay more attention to system performance. The scheme to adopt depends on your data size, table design, fields required, DB used, etc.

Some options available
(1) Still do a table JOIN
(2) From the list of products, get all the supplier needed and use the SQL "IN" construct to retrieve the relevant data. This option is useful if there are a lot of repeated SUPPLIER or when SUPPLIER table is not properly indexed.
puffyeyes Send private email
Tuesday, June 06, 2006

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics
Powered by FogBugz