The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

SQL Server 2005 vs. MySQL

We are wondering whether to use SQL Server 2005 or MySQL as our database. The database will store the following information (for stocks & bonds):

1. Market activity information – price, volume, open, high, low, close for various instruments.
2. Fundamental information – balance sheet line items, income statement line items, news stories etc.
3. Technical indicator values – commonly used moving averages, RSIs etc.
 
Typical use scenario will be that a browser will request the following information:

1. Data for an stock/bond
2. Data for a certain day/time (across a bunch of stocks).
 

The predominant concern here is : SPEED. The application should very quickly be able to handle requests and spit out the necessary data to a .NET client.

 

Do you have any idea what might be the fastest solution in this situation? Can you ask your coders for any suggestions?

I am tempted to go with MySQL since it is free vs. high licensing costs for SQL Server.

The main reason I am considering SQL Server is because of a notion in my head that if it's a .NET/Windows application, SQL Server may be faster.

Thanks!
Shiny Happy Send private email
Saturday, February 16, 2008
 
 
SQL Server hands down.  the MySQL .Net connector is not up to speed in my opinion and doesn't handle bulk loading as of yet (belive that is in an alpha version)
MySQL is for Girls....
Saturday, February 16, 2008
 
 
I would suggest SQL Server 2005 as it is very integrated with .NET technology and development/debugging would be a breeze. But that said database speed is not a simple matter of choosing the database. Depending how "enterprisy" your application is you have to think from physical infrastructure (clustering, SAN) to Database performance tuning (indexes, views, db file storage, etc) to schema design (tables, normalization,constraints, stored procs) etc. If possible a development DBA would be a great team asset as in my experience good programmers are generally bad on sql side.
If your app gets constant real time updates the I assume there will be lots of inserts or updates and MySql is I heard fast for selects only.
nullptr
Saturday, February 16, 2008
 
 
It is not very enterprisey at all. Mostly people will just be viewing data. There won't be any additions to it in real-time at least in the beginning.

We are a startup, so what i worry about is the cost of ownership for SQL Server. Do you have any idea on that? The Microsoft pricing page gives out some pretty scary numbers cost wise, for a startup.
Shiny Happy Send private email
Saturday, February 16, 2008
 
 
If your app is mostly CRUD like then investigate a DB neutral DAL (ORM or even home grown based on .NET provider model) and then compare with both SQL Server 2005 and MySql.SQL Server 2005 express edition is free up to 4GB DB size. I don't know about SQL Server 2005 licensing costs but a DB guru I know said that SQL Server 2005 standard edition on X64 server is a good small business solution. The X64 server gives a huge advantage over 32-bit servers as SQL Server can use more memory in its caches. I have no experience with MySql but .NET, Visual studio and SQL Server 2005 are very tightly integrated from development point of view.
nullptr
Saturday, February 16, 2008
 
 
"I am tempted to go with MySQL since it is free vs. high licensing costs for SQL Server."

It may or may not be free for your specific use. If you intend to use it inhouse for personal use only then it is possibly free. Otherwise, you need to buy licenses and they aren't cheap.

Please read the fine print if you haven't already just to be sure.
uggh
Saturday, February 16, 2008
 
 
Shouldn't this be a showdown between SQL Server 2005 and PostgreSQL (which is truly free, btw)?

MySQL doesn't necessarily stack up to SQL Server/Oracle/Sybase/DB2-level products (yet, I guess give it some credit for the advances it's made lately).

PostgreSQL is closer.

The admin tools for PostgreSQL suck, though.

If using Visual Studio it's SQL Server hands down.  It's just too easy  to use it with those development tools (duh).

If you only need a small server, you can use the Express Edition.

IBM DB2 Express-C is also available, and has integration with Visual Studio (rather deep integration also).  It is like the SQL Server Express Edition equivalent of DB2, but without the Database size limits (making it a more than excellent alternative).  It's more resource hungry on the development machine, though.

It's probably in second place IRT making things easy when working with Visual Studio.

Another choice that's often forgotten is Firebird - Very small, lean, fast, and easy to deploy.  Also truly free for commercial use AFAIK.
Nate Send private email
Saturday, February 16, 2008
 
 
SQL Server. You could write a small app to test the performance on your own server, comparing MySQL to MSDE. This is an investment of one or two days, even if you test with more databaseservers and/or different setups.

You might also want to drop a note in the SQL Server forums on MSDN. There are lots of settings that affect the performance.
Eddy Vluggen Send private email
Saturday, February 16, 2008
 
 
If you need to be able to export anything, then SQL Server hands-down.  I was about to evaluate MySQL for my work, but I couldn't find any easy, hassle-free way of exporting tables to Excel or flat files, which I routinely need to do.  Same with Postgres - nice for basic needs, but I prefer to have everything integrated to make it easier.
WayneM. Send private email
Saturday, February 16, 2008
 
 
"If you need to be able to export anything, then SQL Server hands-down.  I was about to evaluate MySQL for my work, but I couldn't find any easy, hassle-free way of exporting tables to Excel or flat files, which I routinely need to do."

use sqlyog.  http://www.sqlyog.com/.  You can pretty much export any table you want to a CSV file, which can be easily imported to excel.

also,

Postgres might have more features than mysql, but mysql has much better community support (which is why I don't use Postgres).
justin Silverton Send private email
Saturday, February 16, 2008
 
 
We are running both Postgres and SqlServer 2005.
Both are doing its job fine.

While SqlServer might have some advantages administration wise, it is a waste of money from my POV. At least in our case, where the database is just a dumb data container.

I'd rather spend those license fees on hardware.
Fat Albert
Sunday, February 17, 2008
 
 
Here's the thing though - i'm not doing anything too complicated - just using the database as a dumb data container. So that's why i'm so tempted to go with MySQL. I worry that all the features in SQL Server (which i most likely won't need) will mean a performance hit, and therefore i might be better off going with a lighter product.
Shiny Happy Send private email
Sunday, February 17, 2008
 
 
Given you said you're only looking for a dumb data container, then I think you'd be better off with MySQL (or Postgres, depending).  Why spend $5,000/processor if you don't need to? 

You might need to do some kind of performance test to see how fast the access is - I'm sure that SQL Server will be faster, but the speed might not be that great.
WayneM. Send private email
Sunday, February 17, 2008
 
 
>> Why spend $5,000/processor if you don't need to?  <<

MySQL is still not free, for commercial uses.  But it'll be a lot cheaper than SQL Server, that's for sure.
xampl Send private email
Sunday, February 17, 2008
 
 
Another plus for MySql is that it lets you choose a storage engine based on your needs. ISAM should give you super fast read only access, if you don't need transactions.

Falcon is another one to look out for in the near future.

Another thing to consider is that the whole infrastructure is cheaper when using MySql or Postgres. If you need another server, just install another Linux server without worrying about licenses.

As I said above, we are using Postgres and it runs like a charm. I've used MySql before and never had any troubles with it either. I've played around with the MySql connector and it seemed to be working fine.

I personally don't see anything in SqlServer that would be worth spending those license fees on.

But as always, depending on your needs YMMV.
Fat Albert
Sunday, February 17, 2008
 
 
Of course MySQL. You will have much more flexibility when moving database, utilizing a backup solution, scaling hardware etc. It is not just the one-time license fee. When you business grows you will feel much pain from the vendor lock-in and license restrictions.

I do not think that one has to purchase a license for running MySQL. MySQL is GPL licensed, and its license does not prohibit commercial use. As long as the application does not statically link to the Mysql, there is no licensing issue at all.
Glitch
Sunday, February 17, 2008
 
 
I believe you need to be licensed in order to use the super-optimized binary produced by the MySQL company.  You get support too.

Last I heard, they compile it with Intel's ICC compiler and use profile-directed optimization to get something like 20% faster performance than what most people get compiling it from source with GCC.
Jonathan Briggs
Sunday, February 17, 2008
 
 
According to the MySQL website: "OEMs, ISVs, and VARs who distribute MySQL with their products, and are not licensing and distributing their source code under the GPL, need to purchase a commercial license of MySQL."

We are more like a website, i.e. our application doesn't distribute MySQL, so we should be OK?
Shiny Happy Send private email
Sunday, February 17, 2008
 
 
It doesn't matter if you "distribute" MySQL. Simply requiring someone else to download and install MySQL to use your product qualifies for the need to purchase a license. If it is completely optional for them to use MySQL or not then you are probably in the clear. But I am no lawyer. That's why I just steer clear of MySQL. For a lot of organizations, you can waste more time/money trying to figure out their cryptic licenses than you spend just buying SQL Server licenses in the first place.

By the way, would the free Express edition of SQL Server meet your needs? We've used it quite a bit on web sites and it has worked just fine. But these web sites aren't particularly high volume either.
dude
Monday, February 18, 2008
 
 
I think websites should be clear of licensing fees, since that would clearly not require anyone to download & install MySQL.

The Linux guys are so lucky! There are so many interesting low-cost options to consider. I'd love to consider BerkeleyDB etc, but worry about .NET integration hassles.
Shiny Happy Send private email
Tuesday, February 19, 2008
 
 
Oracle is the obvious choice if you want speed. They offer a free version but I don't know wether it's free for business use as well.
MySQL does not really compare but why not try if it's speed is acceptable? SQL Server also has a free version, I suggest downloading all of these, and maybe more, and trying them to see if performance is really a problem. Then you can decide if it makes sense to buy a license.
ByTheWay, MySQL for a webserver requires no license!

Tuesday, February 19, 2008
 
 
>Oracle is the obvious choice if you want speed.

Links?
Anonymous Hippopotamus
Tuesday, February 19, 2008
 
 
I'll reiterate it again.

Try SQL Server 2005 Express.  It's free, and should handle what you need it to do.

Why not try both, and run some tests?  I'd vote for SQL Server, just because that's what I work with professionally.

For my personal websites, I use MySQL.  Why?  It's already on my hosts servers, and they are linux based.  *shrug*
Eric D. Burdo Send private email
Tuesday, February 19, 2008
 
 
Oracle Express is free, but it is restricted to 4GB and one processor.

The problem is, Oracle wants serious money for a "processor license", if you can't license by named users. Which means you pay big bucks if you use it as a web backend.

If you treat the DB like a dumb data container, I don't think that Oracle offers too many advantages over MySql or Postgres .
Fat Albert
Tuesday, February 19, 2008
 
 
If your major concern is speed.  You need to define a representative test and run both side by side.  Anything else is just a bunch of talk like all the previous posts.

You can always buy up from Mysql.  If it is a dumb container, the SQL should be pretty similar and export/import shouldn't be difficult.
Lance Hampton Send private email
Tuesday, February 19, 2008
 
 
When it comes down to speed, it's all going to be about indexing so you can retrieve the data.

There is nothing dumb about a data container doing that for you when you are talking about even as small as 100s of MB or getting into a single table with multi-GB size.

You haven't said much about your system architecture.
Cade Roux Send private email
Tuesday, February 19, 2008
 
 
+1 to Nate.

IBM DB2 Express-C is free, and will scale.
MySQL has problems with multiple processors.

Having said that, if you develop and keep the persistence layer separate through DAOs or similar technology, then switching DBMSs will be a lot less painful down the road.
Knight Who Says Ni
Tuesday, February 19, 2008
 
 
A couple of years ago I wrote a simple benchmark to run on WinXP.

Running 5 million updates gave between 800 and 1,800 transactions per second for SQL Server.

MySQL gave around 15 transactions per second on the same machine for the same benchmark! That is not a typo.

I do not pretend that the benchmark was sophisticated or all-encompassing. It sufficed for my purposes at the time. But the difference in results was staggering.
Craig Welch Send private email
Tuesday, February 19, 2008
 
 
So staggering that there's probably something wrong with the benchmark.  I'd love to be able to believe that SQL Server (which I use) is intrinsically 100x faster than MySQL, but I suspect it really isn't.

From all of the posts that the OP has made here, it's still not clear that the project requires a database at all, let alone that we have enough information to suggest which one is appropriate.  One could meet all of the described requirements by building a memory-resident object model, which would also be fast and free. 

I think that's probably not actually the right answer, but that's because I think that there are a lot of other requirements that we haven't been told about yet.  Knowing those other requirements would help in making useful suggestions.
Robert Rossney Send private email
Thursday, February 21, 2008
 
 
If you are tracking stocks, closing prices, balance sheet data and other corporate information then you must be building a very useful website.

Do not try to penny pinch for the sake of saving a few pennies this week and waste dollars in the years to come. The above quoted benchmark for MySQL is not surprising if they were doing table updates/inserts. AFAIK MySQL locks the table when updating it.

We did a MySQL port of our software but it wasn't practical to roll it out as a packaged solution. We're evaluating Postgres instead.

You want a database that works, you want a database that won't corrupt, or require a lot of hand-holding. Earlier poster recommended the MySQL Falcon engine - a few weeks ago a wiki system I installed on Mysql recommended the Falcon engine because quote it doesn't corrupt as much as ISAM.

It's not hard, you just need to plan your way through it. Design an architecture that grows with your business. Single server now (try fit your db into a SQL Server, Oracle or DB2 express edition license), then move to a full license when you can afford it/need it, then scale up to clustered systems when your business is successful and getting millions of requests per day.

Do you already have expertise in one system or can you hire or train employees in your location with the skills to develop/administer the chosen database and environment?
 
YMMV but this is one company's experience  http://developers.slashdot.org/comments.pl?sid=427378&threshold=1&commentsort=0&mode=thread&cid=22154984

An alternative is to host your website at a provider that will charge you a monthly fee for a big-name database engine. Over five years you may pay more but you can start with something that matches your cash flow and pricing structure.
jz Send private email
Thursday, February 21, 2008
 
 
The existing respondants have pointed it out in pieces, but I want to emphasize that Microsoft has done a pretty smart pricing model for Microsoft SQL Server - you can start with SQL Express which is free - free to develop, free to use, free to distribute.  If you want to distribute it there is an agreement you need to go through, but that still is free.  Here's the big part:  As you grow, if you find you need to go to standard edition for higher performance, you can do so without changing a line of code.  When looking at licensing fees, remember that Microsoft licenses sockets, not cores.  If you look at a modern server, if you don't invest in a SAN or other very high performance disk you're likely to run out of disk before you run out of what a four core processor can do. 

The cost of a single processor license will not be a significant investment in your business if it is scaling to the point that you can't get the performance you want out of Express.  Your cost in hardware, marketing, and most importantly staff will overwhelm it. 

I use MySQL for a few things, but wherever possible I use SQL.  In my past positions I've used pretty much every edition of it, and the management tools, performance, and reliability have been a great rock to build my businesses on.
Kendall Miller Send private email
Friday, February 22, 2008
 
 
"The cost of a single processor license will not be a significant investment in your business if it is scaling to the point that you can't get the performance you want out of Express.  "

Agreed. There are too many people out there bending over dollars to pick up pennies in the IT world. If you need to buy the big Enterprise license of SQL Server then the cost is probably just a drop in the bucket compared to your other operating costs. And the cost of SQL Server Standard Edition is much less than you would probably spend in time on this forum trying to figure out if MySQL is actually free for your use or good enough to use in the first place. Time is money. And a few thousand dollars is nothing to most businesses.

Now if you are running your uISV out of your basement then a few thousand dollars may be a big deal to you. But for those situations SQL Server Express is probably all you need anyway.

Stop believing that a couple thousand dollars is a lot of money. That's just few days salary. If you end up spending even one additional week because you chose MySQL over SQL Server because it was "free" then you just made a huge mistake.
uggh
Friday, February 22, 2008
 
 
Like the original poster, I too have spent the last three months trying to figure out which RDBMS is "best" to use for capturing real-time market data, however, unlike the original poster I'm using it only for personal "production" use and not public.  As far as bang for the buck goes, its hard to beat getting a SQL Server Developer License for $49 from Microsoft, which gives you the right to use their enterprise version on an unlimited number of machines for "development" use only, although I am loosely interpreting that to mean for personal use as well.  I played around with MySQL and Oracle extensively as well, and actually liked Oracle 11g the best once it was up and running, but have decided against using it since 3/4 of your time is spent just administering the damn thing (although probably less once you get past the steep learning curve).  If you do decide to go with MySQL and realtime performance is important, I would suggest putting all of the current day's data into a mysql MEMORY table (which are very fast compared to disk based tables), and then persist them to a mysql ARCHIVE table after the market closes.  There are also two mysql 3rd party storage engines (infobright and solidDB) that are probably better for archiving large amounts of market data in realtime than either MyISAM or InnoDB.
Jeff Baker Send private email
Friday, February 22, 2008
 
 
Why not go with hosted SQL Server 2005 in combination with hosting you website (assuming it is asp.net). Use Express for local developemtn and deploy to the hosted environment.

There is many providers out there that supports SQL SERver 2005, including i a clustered mode with daily backup, that you can download to keep or use with a local express version when testing new releases in a local environment.

Just my 5 cents.
ThMoJe
Monday, February 25, 2008
 
 
If all you concern about is query read speed, I don't think the databases matter much.  With proper index, either one would be blazingly fast.

Since it's a web app, there are other consideration, such as scalability, high availability, and online backup.  For scalability to handle high volume of users, MySQL might win out.  MySQL replication is very simple and works quite well in scaling out database read load.  You can have one master DB just receiving the updates and replicating the changes to N slaves, which would serve the read queries from the user.  That way you can distribute the read load across N MySQL DB.  I don't know how easy and how cheaply you can use N SQL Server to do the same thing.  Lots of high volume websites use MySQL for this reason.

For HA, there are pretty good (and cheap) solution for MySQL on Linux.  I am not sure about Windows HA's capability.
been there done that
Thursday, February 28, 2008
 
 
Actually, I can provide some insight into this. A few years ago, I developed a stock market historical data analysis application, and I faced pretty much the same question.

The application allows the user (me!) to define hypotheses about EOD price patterns and then test those hypotheses with about twenty-five years of historical data. The hypotheses can refer to the prices directly or they can be based on indicators derived from the price data (moving averages, bollinger bands, stochastics, etc).

A typical run through the hypothesis-tester might execute hundreds of millions of calculations, over tens of millions of different price points for various securities in the system. So the damn thing had to be pretty fast.

As I wrote the code, I was very careful to query the database as rarely as possible, caching frequently-used data and always calculating indicator data from memory-resident copies of the prices.

Nevertheless, I noticed huge differences between different database vendors. I isolated the amount of time spent executing all of the queries, and I conducted an apples-to-apples test between MySQL, SQL Server CE, db4o, and a cute little binary database library I wrote myself. (The application was written in C#, so I used the ADO.NET provider for each DB vendor.)

For my benchmark, I fetched 650 securities at random from a database of around 6500 securities, and then pulled the complete trading history for each security. (On average, each security has about 2900 days of trading data in the DB, with a total of about 18.5 million rows in the 'quotes' table.)

Using those trading histories, I constructed 15 indicators (including simple moving average, exponential moving average, standard deviation, and relative strength index, as well as some nested combinations of those indicators).

Then I iterated through all of the day-by-day values for both the actual prices and the price indicators, running some comparisons, and feeding all the values into a function (to simulate the actual number-crunching that would take place in a real hypothesis test).

All of the benchmark timings were calculated using Microsoft's QueryPerformanceCounter and QueryPerformanceFrequency classes (in the System.Runtime.InteropServices namespace).

I compiled the binary with Release flags and with maximum optimizations and ran the benchmark ten times for each DB. Here are the median results:

SQL Server CE v 3.1
---------------------------
  total time: 487.3 seconds
  query time: 388.0 seconds
  non-query time: 99.3 seconds

MySQL 5.1.22 (Connector.Net v 5.0.6)
---------------------------
  total time: 214.8 seconds
  query time: 117.2 seconds
  non-query time: 97.6 seconds

DB4O (6.x.?)
---------------------------
  total time: forever
  query time: who knows? after a few hours,
    it still wasn't finished.

Benji Smith's Custom Database Engine
---------------------------
  total time: 109.7 seconds
  query time: 17.51 seconds
  non-query time: 92.2 seconds

I was actually really surprised to see that SQL server spent more than THREE TIMES as much time executing queries as the MySQL version. The schemas were virtually identical between the two implementations, with B-Tree indexes on the security_id and trade_date columns. I even spent extra time trying to tweak MS SQL connection parameters and buffer sizes to improve performance for, but I couldn't even get it into the same ballpark as MySQL.

Also, I didn't notice any difference in ease-of-implementation between MS SQL and MySQL. The query dialects are a little bit different, but that's no big deal. And I didn't find that Visual Studio made an iota of a difference in getting the different DAO layers in place.

DB40 was disappointing. The queries were dirt-simple to write (since they're just written as predicates in ordinary C# code), but the performance was abysmal. I'm sure they've improved things over the last year or two, but I was extremely disappointed at the time.

My own little experiment in writing a database was very educational. I designed a binary file format capable of handling an arbitrary number of columns (with about fifteen different data types), and my own B-Tree indexes.

I think the reason my own database was the fastest is because there didn't need to be any communication protocol between the DB and the application. The whole DB engine lived as a library inside my app.

Also, I implemented a few features that I've always missed from DB engines (Such as: a way to give hints to the storage engine about how to organize the records; two records likely to be returned in the same result set should be NEAR each other in the actual storage hierarchy. Event the best index can only tell use WHERE a record resides. If two records in a result set are far apart, the index doesn't help retrieve them more quickly. Positioning them in the same block on the hard disk DOES make the query faster.)

Aaaaand of course, I skipped implementing tricky stuff like thread safety and locking and...a query language. All of my queries were implemented directly in C#, with no pesky SQL parser in the way.

So it was a fun project, and it showed me just how much faster a database engine could be, if you use a few clever algorithms and skip out on all the fancy enterprisey features. But it's nowhere near ready to be used in a real product.

And besides, MySQL was pretty darn fast.

A few years prior to these benchmarks (before I ported the whole app from Java to C#), I also tested HSQLDB. It was reasonably fast (slower than MySQL but faster than MS SQL, if I remember correctly), but it couldn't handle multiple gigabytes of data. It always crashed, long before I could get my whole historical price data archive loaded.

I had also planned on testing Firebird and Berkeley DB, but then I got distracted with other projects and never got around to it.

Anyhoo... I hope that info helps. If you have any other questions, just holler. I'll see if I can fill in as many details as possible.
BenjiSmith Send private email
Wednesday, March 05, 2008
 
 
Can you run the benchmark with SQL Server Developer Standard or Developer Edition?  And any possibility you can do Firebird SQL Server also?
Nate Send private email
Wednesday, March 05, 2008
 
 
Unfortunately not.

Testing with the other MS SQL versions would require me to download and install them, and I know how much of a pain it can be to switch back and forth between MS SQL versions. And the firebird benchmark would require me to port my DAO layer to a new SQL dialect, and I'm certainly curious.

It'd be interesting, but since I'm not actively developing this product right now (and since I'm pretty sqamped with other work), it wouldn't be worth the effort to me.

Sorry :(
BenjiSmith Send private email
Wednesday, March 05, 2008
 
 
Hey Benji,

Thanks for the detailed response! Exactly what i was looking for!

Will shoot you an email offline.

best,
Shiny Happy Send private email
Monday, March 10, 2008
 
 
No problemo.

Fire away. I'll point my eyeballs at ye olde inboxe.
BenjiSmith Send private email
Monday, March 10, 2008
 
 
There is more options.

Largest problem with stock databases is that they get updated extremely frequent. I have been working with a stock site company who has a 15 minutes delayed updates. The streams of updates was massive. They had to either turn off transactions on their MySQL or find another option.

The problem is the number of disc writes your disc system can handle. So either you want to purchase an expensive disc system, or scale out over multiple machines (and disc systems).

I could suggest you to look at ScimoreDB Server/Distributed which can distribute over multiple machines, but still appear to your program as a single sql server.

It support the standard things - its free, it supports t-sql (except cursors), ado.net provider, has a query manager and is "optimized for windows".
Thomas Hilbert Madsen Send private email
Tuesday, March 11, 2008
 
 

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics
 
Powered by FogBugz