The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

What is meant by "Scalable database"? Scalable to what extent?

Nowadays "Scalable" is a buzzword in the RDBMS field. I want to know what "Scalability" actually means: to what extent, and on what factors, can a database be considered "Scalable"?

If running on multiple OSes is "Scalability", then I've seen enterprises stick to one or two OSes for years.

If "Scalability" means reduced cost for later improvements, then I've found that to be marketing hype. After 3 or 4 years, organizations want new features implemented, and they invest there. To perform well, they invest in hardware. By that time a new version of the OS has been released, so they invest in an OS upgrade. Soon they hear that the RDBMS itself has changed, and the company pressures them to move to the next upgrade.
K
Thursday, February 01, 2007
 
 
Scalability means that adding hardware will result in improved performance and throughput, in a reasonably linear relationship.

This may sound obvious, but it can be quite hard to maintain linear scalability without getting bottlenecked at certain load levels.
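
One rough way to put a number on "reasonably linear" is to compare the throughput gain with the hardware gain. The sketch below is only an illustration: the function names and the measurements are hypothetical, not taken from any real benchmark, and Amdahl's law is brought in here as one common model of why a bottleneck appears. It is not something the thread itself cites.

```python
def scaling_efficiency(base_units, base_tpm, scaled_units, scaled_tpm):
    """Observed throughput gain divided by hardware gain (1.0 = perfectly linear)."""
    hardware_ratio = scaled_units / base_units
    throughput_ratio = scaled_tpm / base_tpm
    return throughput_ratio / hardware_ratio


def amdahl_speedup(units, parallel_fraction):
    """Amdahl's law: best-case speedup when only part of the work parallelizes."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / units)


# Hypothetical measurements: (CPUs, transactions per minute). Not real data.
measurements = [(4, 40_000), (8, 76_000), (16, 130_000), (32, 180_000)]

base_units, base_tpm = measurements[0]
for units, tpm in measurements:
    eff = scaling_efficiency(base_units, base_tpm, units, tpm)
    print(f"{units:>3} CPUs: {tpm:>8,} tpm, efficiency {eff:.2f}")

# With 5% of the work serialized (e.g. lock contention), the speedup
# stays below 20x no matter how much hardware is added.
print(f"Amdahl speedup at 1000 units: {amdahl_speedup(1000, 0.95):.1f}x")
```

An efficiency that stays near 1.0 as hardware is added is what linear scaling looks like; a value that keeps dropping suggests a bottleneck at that load level.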
Mike S
Thursday, February 01, 2007
 
 
This will give you some idea of "scalable":

http://www.tpc.org/tpcc/results/tpcc_perf_results.asp

Basically, scalability is growth -- can we grow (upward and outward) the server platform to increase throughput? Can you keep adding to it (usually hardware -- more disk, RAM, CPU) and get the thing to handle unreal, gargantuan amounts of data?

Check the site linked above for some truly *monstrously* scaled databases. My largest database (larger than most folks ever get involved with) handles on the order of one million to four million transactions daily. There are databases on the site above that handle that many transactions per *minute*. That's truly a Ginormous amount of data and throughput.
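
To make the per-day versus per-minute gap concrete, here is a quick back-of-the-envelope conversion. It is just arithmetic on the round figures quoted above (one million and four million per day); the helper name is made up for the example.

```python
MINUTES_PER_DAY = 24 * 60  # 1,440 minutes


def per_minute(transactions_per_day):
    """Average transactions per minute for a given daily volume."""
    return transactions_per_day / MINUTES_PER_DAY


for daily in (1_000_000, 4_000_000):
    print(f"{daily:,} per day is about {per_minute(daily):,.0f} per minute")

# A system sustaining the same count every minute carries 1,440 times the load.
print(f"Per-minute vs per-day ratio: {MINUTES_PER_DAY:,}x")
```

So a few million transactions a day averages out to only a few thousand a minute, roughly three orders of magnitude below the systems at the top of that list.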

For instance, with my platform of choice (MS SQL Server), you can start small on a single desktop with a few dozen transactions a day from a single user, and end up scaling up to a million transactions a minute from, say, 50,000 concurrent online users.

BTW -- here are the stats on the highest performing MS SQL Server on the above linked site:

http://www.tpc.org/results/individual_results/HP/hp_orca1tb_win64_ex.pdf

That's incredibly BigIron(tm) -- 64 high-end processors, a Terabyte of RAM and over 1,700 hard drives hanging off the thing. I'd call that "scalable" by anyone's definition.

Note that SQL Server is currently getting stomped by Oracle and DB2, in terms of both raw throughput *and* system pricing. I'd say that they are both "scaling" bigger and faster than my platform of choice.

Of course, there are physical limits. There is no such thing as "infinite scalability" -- but the limits get pushed year after year with BiggerBetterFaster hardware and new versions of the database engines.
Sgt.Sausage
Thursday, February 01, 2007
 
 
Grant
Thursday, February 01, 2007
 
 
"Note that SQL Server is currently getting stomped by Oracle and DB2, in terms of both raw throughput *and* system pricing."

Hmmm.... 60% of the top ten lists in Price/Performance are still MSSQL. Ten years ago 95% of them were MSSQL, but that hardly means that MSSQL is getting stomped. My favorite observation to show people is the obvious lack of MySQL in the Price/Performance list. The submissions are mostly made by hardware vendors, so I would think that if running MySQL would put them at the top of the list, they would do it at the drop of a hat.
JSmith
Thursday, February 01, 2007
 
 
There's nothing new about database scalability, and it's far from being a buzzword. It has been the focus of the major vendors for many, many years; they are dedicated to high concurrency, low locking overhead, recoverability, etc. Maybe it has become an issue in a different way because it is now unfashionable to teach or learn about databases. That leads to a decline in the experience of those working with them, then to a belief that they don't scale, and then to exciting new application tiers requiring massive hardware that people do see as fashionable.

I don't see it as being related primarily to hardware either. It's related to the number of transactions that can be supported by the architecture, or the bandwidth that it can deliver. Higher hardware specifications are just the means of allowing it to scale.
David Aldridge
Friday, February 02, 2007
 
 
==>Hmmm.... 60% of the top ten lists in Price/Performance are still MSSQL.

Didja look at the link I posted? 1 of the top ten on that page is MSSQL. I'd call that 10% not 60%.

The Price/Performance page ain't really relevant to massive scalability -- those price/performance ratios are based on performance of tens of thousands of transactions per minute. The page I linked was for the BigIron with throughput of *millions* of transactions a minute and ... yes ... MS SQL is getting thoroughly stomped in that range.

Nothing against MS SQL -- it's what I use. Just stating the fact that at massively scaled throughputs it ain't performing up to its peers.
Sgt.Sausage
Friday, February 02, 2007
 
 

This topic is archived. No further replies will be accepted.
