The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

database question

here's the situation:
I have a small business application.
The setup is a VB6 front end to a Jet database back end, using DAO.
There are unlikely to ever be more than 10 computers connected concurrently.
The database file is on a separate server computer that does nothing but store this file.
Connections are kept open from the moment a user starts the VB program to the moment they close it, as this seems to save (a lot of) time making connections for each query.
The database contains personal and sales details of about 5,000 clients.
I can expect this to rise towards 10-15,000 over the next 5 to 10 years (reasonable expectation).
The number of sales per client averages about 40 and will gradually rise at a rate of about 8 per year (guessing).
So the sales table, if it were a single table, would have hundreds of thousands of records.
To avoid this, 100 sales tables were created: sales00, sales01, ... sales99.
Clients whose primary key ends in 05 have their sales info in table sales05, etc.

The SQL for the queries is built using variables to insert the correct table number.
All the SQL is in the VB program; there aren't any stored procedures (actually, I don't know if that is even possible in Jet).
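[Editor's note: the dynamic table-name scheme described above can be sketched roughly as follows. This is a minimal illustration using Python and SQLite in place of VB6/Jet; the table and column names are assumptions, not the OP's actual schema.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Create two of the hundred partition tables, sales00 .. sales99.
for suffix in ("05", "42"):
    conn.execute(f"CREATE TABLE sales{suffix} (client_id INTEGER, amount REAL)")

conn.execute("INSERT INTO sales05 VALUES (1105, 19.99)")
conn.execute("INSERT INTO sales42 VALUES (2342, 5.00)")

def sales_for_client(client_id):
    # A client whose key ends in NN has its rows in table salesNN,
    # so the table name is spliced into the SQL text at run time.
    suffix = f"{client_id % 100:02d}"
    sql = f"SELECT amount FROM sales{suffix} WHERE client_id = ?"
    return [row[0] for row in conn.execute(sql, (client_id,))]

print(sales_for_client(1105))  # [19.99]
```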

Indexes on fields that are in 'where' clauses.

The database is compacted every day when the first user runs the program.

Business-wide stats need to be accessed less often; they come from a single temporary sales table, populated with SELECT/INSERT INTO statements from all the other sales tables.
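[Editor's note: that roll-up amounts to something like the following. Sketched with SQLite standing in for Jet, with only three partition tables instead of a hundred; names are hypothetical.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A cut-down version of the scheme: a few of the salesNN partition tables.
for suffix in ("00", "01", "02"):
    conn.execute(f"CREATE TABLE sales{suffix} (client_id INTEGER, amount REAL)")

conn.execute("INSERT INTO sales00 VALUES (100, 10.0)")
conn.execute("INSERT INTO sales01 VALUES (101, 20.0)")
conn.execute("INSERT INTO sales02 VALUES (102, 30.0)")

# Build the temporary all-sales table by inserting from each partition,
# then run the business-wide stats against it.
conn.execute("CREATE TEMP TABLE sales_all (client_id INTEGER, amount REAL)")
for suffix in ("00", "01", "02"):
    conn.execute(f"INSERT INTO sales_all SELECT * FROM sales{suffix}")

total = conn.execute("SELECT SUM(amount) FROM sales_all").fetchone()[0]
print(total)  # 60.0
```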

The idea was obviously to keep the sales tables small, as the reading I had done, and dummy runs I made, suggested that Access/Jet tables performed well with fewer than 20,000 records: query time would grow linearly up to that point and then exponentially afterwards.

The Problem:
1: Query time (after 5 years) is beginning to lengthen from instant to the level where users are commenting on it. (Sales tables average 2000 records each)
2:  Database compaction time is beginning to take a bit too long (file is about 60MB).

I am thinking of moving to SQL Server Express. I can fairly simply convert the DAO statements to ADO, I think.
However the reading I've done on the web suggests that the performance might even be worse in SQL server.
I don't need the scalability in terms of numbers of concurrent users.
I'm not shifting huge amounts of data over the network because the tables are still small
and a lot of people claim that stored procedures don't improve performance all that much.

What should I do to improve performance?
dog breath
Friday, August 01, 2008
"What should I do to improve performance? "

The first thing you should do is learn a whole lot about the theory and practice of database design and development.
One of the things you will learn is that a premature focus on performance will generally make the design unusable,  while providing minimal performance benefits at best.

A simple and sound logical design, followed by reasonable tuning of the physical database for performance, when necessary,  will carry you a lot further, except in unusual circumstances.

I don't know the jet engine myself.  But for every DBMS engine that I do know, dividing the sales table into 100 smaller tables would be about as counterproductive as it gets.  For any DBMS engine of industrial strength, hundreds of thousands of rows is still small potatoes.

I will leave it up to people who know both the jet engine and databases in general to answer the question of whether the jet engine is up to the job.
Walter Mitty
Friday, August 01, 2008
> However the reading I've done on the web suggests that the performance might even be worse in SQL server.

I find that surprising. Would you hyperlink to some of what you've read, for me?
Christopher Wells
Friday, August 01, 2008
>The first thing you should do is learn a whole lot about the theory and practice of database design and development.
>One of the things you will learn is that a premature focus on performance will generally make the design unusable, while providing minimal performance benefits at best.

I'm sure you're right but it got me out of a hole at the time. I did read up a fair bit about normalising and indexing at the time but I couldn't get it to work really fast when working on large amounts of data.
(And in fairness it's worked well for 5 years so it's not totally unusable.)

> I find that surprising. Would you hyperlink to some of what you've read, for me?
dog breath
Friday, August 01, 2008

The OP found his test results surprising, and I do too.

Still, the people replying to him there are saying that the Access [sic] database is performing well because it's on the same machine as the application ... which isn't your use case (you have "10 computers connected").

I'm also suspicious of his results (i.e. 60 seconds to do 200 SELECTs): on my machine, in my own code (C#), with a local SqlExpress instance (but not so many records), a select-and-then-read-resultset takes 0 to 2 msec.

I suggest you write your own test case to benchmark the performance of your own use case.
Christopher Wells
Friday, August 01, 2008
>Still, the people replying to him there are saying that the Access [sic] database is performing well because it's on the same machine as the application

Yes, but he denied this, saying:
"But the comparison was with databases both on the same network server.."

>I suggest you write your own test case to bench-mark the performance of your own use-case.

Yes, I guess that's what I will do. Oh joy!
dog breath
Friday, August 01, 2008
In the long run, moving to SQL Server Express, and having a proper design will improve performance more. Having a ton of partial tables really isn't the best of designs.

In the short run, however, it sounds more like there's an error somewhere. 2000 records isn't really a large amount, and it should not be noticeable yet.

Are you requesting more data from the server than you need? Give your SQL queries a check, and make sure they are retrieving only what is needed, and not extra data.
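[Editor's note: the difference Rohan describes can be sketched like this. SQLite stands in for Jet; the schema is hypothetical. The first form drags every column of every row to the client and filters there; the second lets the engine filter and returns only the needed columns.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (client_id INTEGER, sale_date TEXT, amount REAL, notes TEXT)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [(i % 50, "2008-01-01", float(i), "x" * 200) for i in range(1000)],
)

# Wasteful: pulls every column of every row across the wire,
# then filters in the client program.
rows = [r for r in conn.execute("SELECT * FROM sales") if r[0] == 7]

# Better: let the engine do the filtering, and fetch only the columns needed.
rows2 = conn.execute(
    "SELECT sale_date, amount FROM sales WHERE client_id = ?", (7,)
).fetchall()

print(len(rows), len(rows2))  # same row count, far less data moved
```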
Rohan Verghese
Friday, August 01, 2008
I don't know sql server or the jet engine (haven't used it in 10 years), however, you might want to check on your hardware since you mentioned your application is 5 years old.  Things like storage can degrade over time.

Like the others, I was surprised to hear about the poor database performance given the small volumes you are working with.

Benchmark and monitor your system components.  Having data points over time will help you see which components may be causing the application slowdown.
Chi Hoang
Friday, August 01, 2008
Jet, when last I used it, behaved very poorly when multiple users queried it.  This shouldn't be surprising:  as a client-side database, multi-user access has to be controlled through the file system, which is awfully inefficient.

When I was doing Access development, it was not uncommon to have a local database for users that we'd copy tables into for querying, just to boost performance.  You might consider that as a short-term fix, if your users have read-only access to data and it isn't being updated in real time.

Moving to a server-side database will mitigate the multi-user problems that Access poses.  You should see great performance increases.  Note that if you're going to benchmark the two approaches (I actually wouldn't bother, because I know the answer already), you have to make your tests multi-user in order for them to be remotely representative.
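[Editor's note: the short-term fix Robert mentions, copying tables to a local database for read-only querying, can be sketched as follows. SQLite's ATTACH stands in for the Access equivalent; file and table names are hypothetical.]

```python
import os
import sqlite3
import tempfile

# "Server" database (stands in for the shared Jet file on the network).
server_path = os.path.join(tempfile.mkdtemp(), "server.db")
server = sqlite3.connect(server_path)
server.execute("CREATE TABLE sales (client_id INTEGER, amount REAL)")
server.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 5.0), (2, 7.5)])
server.commit()
server.close()

# Local working database: attach the server file, copy the table once,
# then run all read-only queries against the local copy.
local = sqlite3.connect(":memory:")
local.execute("ATTACH DATABASE ? AS remote", (server_path,))
local.execute("CREATE TABLE sales AS SELECT * FROM remote.sales")
local.execute("DETACH DATABASE remote")

total = local.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
print(total)  # 12.5
```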
Robert Rossney
Friday, August 01, 2008
For God's sake, stop using Jet. Any one of the free/express true database engines will cause fewer headaches for the rest of your life.

Start with MS-SQLServer Express if you are in an MS environment.
IT Guy
Friday, August 01, 2008
You don't mention if you are using a server operating system. If you are using plain MS-XP Home or MS-XP Pro then you are limited to 10 connections. If you have other shares open then the share connections could be flipping on and off to accommodate more than 10 connections.
IT Guy
Friday, August 01, 2008
Thanks for the advice, everyone.

The hardware is probably OK but I will look into it.
I can shift a 50MB file across the network in about 6 seconds which doesn't sound too bad.

For the very short term I'm going to concentrate on optimising the (relatively few but heavily used) slower queries.

The definite medium term goal is to leave Jet and move to sql server express.

Problem is that I know next to nothing about it as of now, so I need to read up a bit first.

Longer term may redesign the database a bit, lol.
dog breath
Friday, August 01, 2008
Access + network + 10 clients = disaster.

Immediate goal #1 is to move this to a SQL Server instance. This should be done before any attempts to fix or optimize the current situation. Once you are on a stable platform then you can analyze performance and optimize queries all day long.

Also, proper database design goes a long way towards optimizing database performance. Once the schema has been optimized you can index the tables properly, and then you should be able to run queries against millions of rows with no discernible performance hit.
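[Editor's note: the "index the tables properly" point can be made concrete with a query plan. The sketch below uses SQLite rather than SQL Server, but the principle is the same: without an index the engine scans every row; with one, it touches only the matching rows. Names are hypothetical.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (client_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)", [(i % 1000, 1.0) for i in range(10_000)]
)

# Without an index, the query plan shows a full table scan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT SUM(amount) FROM sales WHERE client_id = 42"
).fetchall()
print(plan[0][-1])  # full table scan

# With an index on the WHERE column, the engine seeks directly
# to the matching rows.
conn.execute("CREATE INDEX idx_sales_client ON sales(client_id)")
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT SUM(amount) FROM sales WHERE client_id = 42"
).fetchall()
print(plan[0][-1])  # mentions idx_sales_client
```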
The Original Henry
Saturday, August 02, 2008
"The setup is vb6 front end to jet database backend using DAO"

Now that brings back happy memories of the 1990s :)

As others have said, while the Jet database will be super-fast on your own computer, it will slow down significantly over the network, and if many people are using it, performance gets progressively worse.

Check out SQL Server 2005 Express. It's completely free! I guarantee you that it will be a lot faster in a networked multi-user environment. 2000 records is a very tiny table. SQL Server can easily handle million-record tables. Also, buy the Developer Edition of SQL Server 2005 for only $50 in order to get the full-blown Management Studio.
Grumpy .NET programmer
Sunday, August 03, 2008
It's not even clear to me that the DAO interface gives the Jet engine the information needed to optimize the query.

The OP might even be paying the freight for 200,000 tiny queries in order to access 200,000 rows.  In such an environment 100 tiny sales tables might indeed outperform one big sales table.

As far as pulling only the relevant subset from the sales table,  without a lot of extra reading,  well, that's what indexes are for.

For those of us accustomed to expressing our query in SQL,  and letting the optimizer do its job,  the mindset of the OP might be completely alien to us.  When he's ready,  he'll learn what he needs to learn.

Good luck, Dog Breath!
Walter Mitty
Sunday, August 03, 2008
As per everyone's recommendation I plan to move to sql server express as soon as possible.
The steps seem fairly straightforward:
1: Replace DAO code with ADO in my VB6 code.
2: Install SQL Server Express.
3: Use SSMA to import the database into SQL Server.
4: badda bing!

Only it's a bit more complicated than that, I'm sure.
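[Editor's note: step 3's import is also a natural point to fold the hundred salesNN tables back into a single indexed table, as several replies suggest. A sketch of that consolidation, using SQLite and only two partitions; names are hypothetical.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Re-create a cut-down version of the partitioned layout.
for suffix in ("00", "01"):
    conn.execute(f"CREATE TABLE sales{suffix} (client_id INTEGER, amount REAL)")
conn.execute("INSERT INTO sales00 VALUES (100, 1.0)")
conn.execute("INSERT INTO sales01 VALUES (101, 2.0)")

# One real sales table, indexed on client_id, replaces all of them.
conn.execute("CREATE TABLE sales (client_id INTEGER, amount REAL)")
conn.execute("CREATE INDEX idx_sales_client ON sales(client_id)")
for suffix in ("00", "01"):
    conn.execute(f"INSERT INTO sales SELECT * FROM sales{suffix}")
    conn.execute(f"DROP TABLE sales{suffix}")

count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(count)  # 2
```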
dog breath
Sunday, August 03, 2008
I would check the box you are storing/running that DB from. Check the performance of that box and the MDAC version installed; that may be the biggest bottleneck. I've used Access like this years ago and it ran pretty well on fast boxes. As you know, the other bottlenecks are network or switch bandwidth; your queries (are you filtering all your joins so they are as small as possible?); caching or pooling of connections (after 10, a lot of calls could be pooled or waiting); and the app that's processing these queries (what business logic is massaging the data in conditional loops or long sorts, etc.?). Look into all those items.
Ranger
Sunday, August 03, 2008
"The steps seem fairly straightforward:"

You're definitely on the right path; sure, it takes some time and you have to learn some new stuff, but it'll be worth it. Just as a reference point, I maintain one system where we have SQL Server running on a dual-CPU box with nothing fancy (i.e. it's not a "super-duper" server, just a regular Dell box). The main database has data from an invoicing system, where invoice rows are used for just about every financial report.

There are currently over 30 million records in the invoice row table, and there are 20+ users using the system at all times (peaks at over 50 users). The system is not very fast, especially during end-of-month reporting frenzy, but it's perfectly usable. Queries like "how many items of each category did I sell last month" (sales guys get bonus based on this type of information, so they keep running these reports all the time, over and over ;-) take only a few seconds.

Some effort has been put into optimizing the report queries by hand, setting up views and some stored procedures, and indexing. All in all I can say that SQL Server (the Express Edition is pretty much the same engine) is a very capable server.
El Dorko
Monday, August 04, 2008
Christopher mentioned: "60 seconds to do 200 SELECTs"
If you are doing a SELECT and then doing another SELECT based on each of the results of the first, you will kill performance, regardless of the database you are using.  For example: if your initial result is 200 records then you will be issuing 201 SELECT statements against your database.
Look for the spots with performance issues and address these before changing database.
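[Editor's note: this "N+1 query" pattern, and the single-statement alternative, can be sketched as follows. SQLite stands in for the database; the schema is hypothetical.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clients (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE sales (client_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO clients VALUES (?, ?)",
                 [(i, f"c{i}") for i in range(200)])
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [(i, 1.0) for i in range(200)])

# N+1 pattern: one query for the client list, then one query per
# client -- 201 statements in all.
queries = 0
totals = {}
queries += 1
for (cid,) in conn.execute("SELECT id FROM clients").fetchall():
    queries += 1
    totals[cid] = conn.execute(
        "SELECT SUM(amount) FROM sales WHERE client_id = ?", (cid,)
    ).fetchone()[0]

# The same answer in a single statement via GROUP BY.
one_query = dict(conn.execute(
    "SELECT client_id, SUM(amount) FROM sales GROUP BY client_id"
).fetchall())

print(queries)               # 201
print(totals == one_query)   # True
```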
Thursday, August 07, 2008
Before this thread becomes locked I'd just like to inform all those kind enough to post comments that I'm well into the conversion process from Jet to SQL Server Express. So far it's looking good. I'll post a comprehensive report in a new thread later.
dog breath
Wednesday, August 27, 2008

This topic is archived. No further replies will be accepted.
