The Joel on Software Discussion Group (CLOSED)

A place to discuss Joel on Software. Now closed.


Database in memory

Has anyone used a database completely loaded into memory? Any tips on where I can read up on this?

We have a transactional system that slows to a crawl during report generation, which requires many database queries over large data sets.

A company bigwig suggested we load our database into memory. I have not used this before, but I remember hearing about it at one of the free Oracle conferences.
Hank
Thursday, February 23, 2006
 
 
Do you use MS SQL Server? In my opinion it's one of the best databases for out-of-the-box performance. MySQL has support for some memory caching, but I don't know if it will be useful in your case. Anyway, chances are it's an algorithmic problem in the application or an optimization problem in the database. For one, I would avoid Interbase at all costs, because it usually has some problems optimizing queries and whatnot.

Summing up, simply loading a database into memory may not be enough. :-)
Lostacular
Thursday, February 23, 2006
 
 
I missed the ending of your post. I'm kind of sleepy. Sorry.
Lostacular
Thursday, February 23, 2006
 
 
Didn't read your whole post...

The standard pattern for reports on a transactional database is this:
Have 2 databases. Your transactional database handles updates/deletes/inserts. Your second database handles all of the reports, and you replicate EVERY FEW MINUTES. This means the report database is slightly out of date, but it takes that load off of your transactional database.

I am most familiar w/ MS SQL Server, where you do this with log shipping, but Oracle and DB2 make it easy to implement this pattern as well.
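For illustration, here is a minimal sketch of that shape of pattern, not SQL Server log shipping itself, just a periodic copy from a transactional database to a reporting copy, using Python's standard sqlite3 module. The file names and refresh interval are made up.

```python
# Hypothetical sketch: every few minutes, overwrite a reporting database with
# a fresh copy of the transactional one; all reports then query the copy.
import sqlite3
import time

TRANSACTIONAL_DB = "transactions.db"   # assumed file names
REPORTING_DB = "reporting.db"
REFRESH_SECONDS = 300                  # "every few minutes"

def refresh_reporting_copy() -> None:
    """Copy the live database into the reporting database in one shot."""
    src = sqlite3.connect(TRANSACTIONAL_DB)
    dst = sqlite3.connect(REPORTING_DB)
    try:
        src.backup(dst)  # online copy; the live database stays available
    finally:
        src.close()
        dst.close()

if __name__ == "__main__":
    while True:
        refresh_reporting_copy()
        time.sleep(REFRESH_SECONDS)    # reports read a slightly stale copy
```

The reporting copy is always a few minutes behind, which is exactly the trade-off described above: stale but cheap to hammer with queries.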
PWills Send private email
Thursday, February 23, 2006
 
 
Have a look at the in-memory database "TimesTen" that Oracle bought up last year. They are getting it better integrated as a front end.

On the other hand, what have you looked at in tuning those report queries?
David Aldridge Send private email
Thursday, February 23, 2006
 
 
If you can condense the data you need to a single table or view, perhaps make a client-side dataset on another box and move back and forth through that?  It would be functionally the same as loading the whole thing into memory.
Deane Send private email
Thursday, February 23, 2006
 
 
I agree with the other posters: it is very easy to write a query that gets the correct answer v e r y  s l o w l y.

Some things to think about:
- if you are iterating through a list and running a (small?) query for each iteration, consider running one big JOIN query and watching for changes in your "outer" column value. This is a very old technique known as merge-sort-group, and it still works fine <s>
- if you are using SQL predicates to limit your result set to the top n values or only values above a certain threshold, and these are causing your bad performance (you are profiling the queries, ja?), consider using ORDER BY instead of WHERE for those criteria and have the SQL client application do the test (a rough sketch of both ideas follows).
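For what it's worth, a rough sketch of both ideas against a made-up customers/orders schema, using Python's standard sqlite3 module; every table and column name here is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customers(id),
                         amount REAL);
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders VALUES (1, 1, 50.0), (2, 1, 75.0), (3, 2, 20.0);
""")

# 1) One big JOIN instead of one small query per customer: walk the ordered
#    result and break the subtotal whenever the "outer" value changes.
rows = conn.execute("""
    SELECT c.id, c.name, o.amount
    FROM customers AS c JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.id
""")
current = None          # (id, name) of the customer currently being summed
subtotal = 0.0
for cust_id, name, amount in rows:
    if current != (cust_id, name):
        if current is not None:
            print(f"{current[1]} subtotal: {subtotal:.2f}")
        current, subtotal = (cust_id, name), 0.0
    subtotal += amount
if current is not None:
    print(f"{current[1]} subtotal: {subtotal:.2f}")

# 2) Sort in the database, but apply the top-n / threshold test in the client
#    rather than in the WHERE clause.
top_orders = []
for order_id, amount in conn.execute(
        "SELECT id, amount FROM orders ORDER BY amount DESC"):
    if amount < 40.0 or len(top_orders) >= 2:   # threshold and top-n, client-side
        break
    top_orders.append((order_id, amount))
print("top orders:", top_orders)
conn.close()
```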

Cheers
Richard C Haven Send private email
Thursday, February 23, 2006
 
 
For queries, most modern RDBMSs are effectively in-memory databases. SQL Server, for instance, will cache as many pages as possible in memory, only hitting the storage when there are writes. This is why some incorrectly believe it is "leaking memory" as it continually gets larger (until it hits a set limit, or a high percentage of free memory).

Add more RAM to your server. With 64-bit processors, even small shops are starting to look at huge-memory servers.

Obviously this is all presuming that the query is well written and the schema is optimal.
Dennis Forbes Send private email
Thursday, February 23, 2006
 
 
Hank, "in-memory" is probably not what you're after since you have a "transactional" application. You probably need someone who knows what they are doing to re-work your database schema and queries.
Anon and anon
Thursday, February 23, 2006
 
 
If you are using Oracle (as the mention of the conference suggests), have a look at the Hotsos web site. Lots of useful information about locating inefficiencies.

One thing I'll quote from a Hotsos paper is that it isn't just physical reads that kill you, it's logical reads, where everything may be buffered in memory but you're still processing too many blocks.
George Jansen Send private email
Thursday, February 23, 2006
 
 
One of our customers bought one of these http://www.superssd.com/products/tera-ramsan/ and put the tables that a lot of their reports run against on it. Goes like sh!t off a shovel. The tables are just snapshots of current data, about 30s behind real time.

Friday, February 24, 2006
 
 
Oops - forgot my name and the fact that this was on sql server.
James Osgood Send private email
Friday, February 24, 2006
 
 
I buy into Forbes' advice, above.

Keeping two databases is a good idea too, but it is not a pattern per se that I have seen often, and I do enterprise database design with Oracle and Oracle Reports. The pattern for big enterprise is tablespaces, more indexes, raw volumes, and tons of memory, as Forbes points out.

In the OSS world, you can literally tune PostgreSQL to the hilt to make heavy use of RAM to overcome performance hits related to IO.

I have to wonder if shipping data off to an in-memory database system to be worked on further is worth anything. If your reports are just simple table scans against large tables, a couple hundred MB, then you are pulling all of that into memory, plus paying the wire cost of moving the data across a LAN and the memory footprint of your app. It seems to me you would need some very beefy (memory-wise) workstations. Multiply this by the complexity of table joins.
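For a rough sense of the wire-speed point, here is a back-of-envelope calculation with assumed figures (a 200 MB result set over a 100 Mbit/s LAN):

```python
# Assumed numbers, purely illustrative.
data_mb = 200
lan_mbit_per_s = 100
transfer_seconds = data_mb * 8 / lan_mbit_per_s  # MB -> Mbit, divide by link rate
print(f"~{transfer_seconds:.0f} s just to move the data")  # roughly 16 seconds
```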

I guess we would have to know more details... Devil is always in the details.  Keep us posted, I'm interested in your progress.
Eric (another ISV guy with his company)
Friday, February 24, 2006
 
 
You might also improve speed by making the actual data retrieval happen directly on the server.  Something like a nightly batch file that runs the report and stores the result in a common place, like an HTML or PDF file on a shared drive.  That kills slowness related to network speed.
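As a rough sketch of that idea, here is a small Python script that a cron job or Windows scheduled task could run on the server; the database file, query, and output path are all invented for illustration.

```python
# Hypothetical nightly job: run the report query on the server and write the
# result as an HTML file on a shared path for everyone to open directly.
import sqlite3
from datetime import date

DB_PATH = "reporting.db"                                   # assumed database
OUTPUT_PATH = r"\\fileserver\reports\daily_sales.html"     # assumed share

def write_report() -> None:
    conn = sqlite3.connect(DB_PATH)
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()
    conn.close()

    html = [f"<html><body><h1>Daily sales {date.today()}</h1><table>"]
    for region, total in rows:
        html.append(f"<tr><td>{region}</td><td>{total:.2f}</td></tr>")
    html.append("</table></body></html>")
    with open(OUTPUT_PATH, "w", encoding="utf-8") as f:
        f.write("\n".join(html))

if __name__ == "__main__":
    write_report()
```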
Clay Dowling Send private email
Friday, February 24, 2006
 
 
I should clarify that by recommending adding more memory, I'm speaking to the goal of an "in-memory database". My experience with poorly performing queries, however, is that they're almost always the result of poorly contrived SQL (e.g. predicates in functions), a lack of appropriate indexes, or frequent long-running transactions that continually block the query (especially a problem with insufficient indexes).
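A small illustration of the "predicates in functions" point, using Python's standard sqlite3 (the same principle applies to SQL Server or Oracle); the schema is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT);
    CREATE INDEX idx_orders_date ON orders(order_date);
""")

# Wrapping the indexed column in a function typically forces a full scan...
plan_scan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders "
    "WHERE strftime('%Y', order_date) = '2006'"
).fetchall()

# ...while an equivalent range predicate on the bare column can use the index.
plan_seek = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders "
    "WHERE order_date >= '2006-01-01' AND order_date < '2007-01-01'"
).fetchall()

print(plan_scan)  # expect: SCAN orders
print(plan_seek)  # expect: SEARCH orders USING INDEX idx_orders_date ...
conn.close()
```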

http://www.yafla.com/papers/SQL_Server_Performance/high_performance_sql_server.htm
Dennis Forbes Send private email
Friday, February 24, 2006
 
 
I agree with Dennis.  For most queries, SQL Server will supply data out of memory.  Looking at our performance logs, most queries came back in 0 ms, meaning it was all out of RAM and the only cost to the client was the network round-trip.  SQL Server *loves* lots of memory.

I also agree with PWills -- you shouldn't be running reports out of your transactional database (other than the most simplistic).  You should log-ship over to a database that is organized and possibly de-normalized to support reporting.
example Send private email
Friday, February 24, 2006
 
 
As hinted above, memory is a shared resource, especially in a multiuser database system.  It must be locked and controlled just like disk.  Oracle (and the others) have mechanisms to do this, but just because it is in RAM don't think the cost is FREE.  Concurrency plays a big role here.
smartin Send private email
Friday, February 24, 2006
 
 
Depends on what you are looking to do.  If you're only looking to optimize report generation, the suggestions in the previous posts sound like they should work well for you. 

If, however, you need to process thousands or tens of thousands of transactions per second, with response times measured in milliseconds, then an in-memory DB may be the answer.  FWIW, my company develops software for just such scenarios, primarily related to securities trading, telecom switching, etc.  Drop me an email if you'd like more info.
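Not the specialized products described above, but for a rough sense of the general "fully in memory" idea, here is a minimal sketch with Python's standard sqlite3, copying an on-disk database into a :memory: connection; the file name is hypothetical.

```python
import sqlite3

disk = sqlite3.connect("transactions.db")   # assumed on-disk database
mem = sqlite3.connect(":memory:")
disk.backup(mem)                            # pull every page into RAM
disk.close()

# All subsequent queries hit memory only; anything written here is lost on
# exit, so this suits read-only reporting, not the transactional workload.
for row in mem.execute("SELECT COUNT(*) FROM sqlite_master"):
    print("tables and indexes loaded:", row[0])
mem.close()
```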
BillT Send private email
Friday, February 24, 2006
 
 
>> The standard pattern for reports on a transactional database is this:
Have 2 databases. Your transactional database handles updates/deletes/inserts. Your second database handles all of the reports, and you replicate EVERY FEW MINUTES. This means the report database is slightly out of date, but it takes that load off of your transactional database.

Completely agree. Reporting against a transactional DB is often a bad idea.
Matthew Send private email
Wednesday, March 01, 2006
 
 

This topic is archived. No further replies will be accepted.
