The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

DB for shared server then dedicated server?

Three-stage plan.
1. Create and run db system on a shared server (assumed lower cost, less control), generate enough revenue from system to justify dedicated server
2. Migrate to dedicated server
3. Run on dedicated server (assumed higher cost, more control)
Which DB engine would do well in all stages? Any frontend recommended?

Platform not that important, comfortable with both unix and windows. Might prefer Windows for development work. TIA.
an' on her mess
Monday, November 27, 2006
 
 
Actually, probably prefer unix for server.
an' on her mess
Monday, November 27, 2006
 
 
Right now, pretty much any database will work.

With the exception of Microsoft Access (which is probably not a good choice for other reasons), every significant production-ready database is network aware. Your front end simply needs to know the database's network address in order to use it, and it doesn't matter whether that address is the same machine or another, separate machine.

When the time comes to move to a dedicated server, simply "configure" the front end to connect to a different network address.
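A minimal sketch of that approach, assuming a Python front end and a hypothetical db.ini settings file; the section and key names are illustrative, and any client library that accepts a host name works the same way:

import configparser

def load_db_settings(path="db.ini"):
    # The database's network address lives in configuration, not code.
    cfg = configparser.ConfigParser()
    cfg.read(path)
    db = cfg["database"]
    return {
        "host": db.get("host", "localhost"),   # shared server today, dedicated later
        "port": int(db.get("port", "3306")),
        "user": db["user"],
        "password": db["password"],
        "name": db["name"],
    }

# Moving to the dedicated server means editing db.ini (host = db.example.com)
# and restarting the front end; no application code changes.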
TheDavid
Monday, November 27, 2006
 
 
Aren't some DBs better than others? MySQL doesn't have triggers, f'rinstance.
an' on her mess
Monday, November 27, 2006
 
 
>> Aren't some DBs better than others? MySQL doesn't have triggers, f'rinstance.

They're different, that's all -- different architectures, different capabilities, different isolation levels and locking models, different features included, etc.

Personally, although I'm an Oraclista, I believe triggers to be the work of the devil for business rules and whatnot -- for auditing maybe, but that's about it.
David Aldridge Send private email
Monday, November 27, 2006
 
 
"Aren't some DBs better than others? MySQL doesn't have triggers, f'rinstance."

+1 to David Aldridge.

MySQL sacrifices triggers (and various other things) for speed and is probably the ideal database for circumstances where you'll do arguably very few "inserts" and a lot of "selects" in comparison.

Wikipedia runs off of MySQL.

On the other hand, if I had to build something like a single database for my state's Department of Motor Vehicles, I'd want something that provides more integrity and performance tuning tools at the database level, like Oracle.
TheDavid
Tuesday, November 28, 2006
 
 
"MySQL sacrifices triggers (and various other things) for speed and is probably the ideal database for circumstances where you'll do arguably very few "inserts" and a lot of "selects" in comparison."

MySQL 5 - released over a year ago - has triggers, stored procedures, etc.
KC Send private email
Tuesday, November 28, 2006
 
 
I honestly thought that was optional, in the sense that you had to decide at installation whether or not to include them?
TheDavid
Tuesday, November 28, 2006
 
 
"MySQL 5 has triggers, stored procedures etc"

Well that certainly suggests one possible path
1. On the shared server, take whatever version of MySQL the ISP deigns to support (commonly 4.x)
2. Migrate to MySQL 5
3. Install/run MySQL 5 on the dedicated server

However, it strikes me that step 2 would be "non-trivial" ;) I'll definitely be using triggers since certain db ops, e.g. INSERT, will be charged for and will add rows to a transaction table. Using MySQL 4, I'd have to implement "triggers" in client-side code before replacing them with MySQL 5 triggers. Seems like a lot of work for a Lazy Programmer. Comments?
an' on her mess
Tuesday, November 28, 2006
 
 
>> Comments?

Don't use triggers.
David Aldridge Send private email
Tuesday, November 28, 2006
 
 
Yes, triggers were brought into the discussion merely as an example. Ideally you'd like to keep all of your business logic together in one place so that it's easier to test, modify and/or fix.

Transactions, commits and rollbacks are (as a group) probably the only "advanced feature" I'd research and practice with before I decided upon a database, if I needed that capability. In my experience, which is admittedly not that deep or wide, everything else can either be done in the application or is something you'd want a good DBA to oversee anyway.
TheDavid
Tuesday, November 28, 2006
 
 
I want to automatically trap INSERTS on some tables and add to the transaction table. Transaction codes, fees etc are all looked up, so it's completely automatic. If I don't use triggers, how?
an' on her mess
Tuesday, November 28, 2006
 
 
>> I want to automatically trap INSERTS on some tables and add to the transaction table. Transaction codes, fees etc are all looked up, so it's completely automatic. If I don't use triggers, how?

You have to write the code to do this anyway, so do it in the application. If necessary, encapsulate the inserts in a single procedure that includes the additional code that you're thinking of putting in a trigger.

Otherwise bugs are going to be dreadful to track down -- every time a row is inserted/deleted/updated the code maintainer has to think about what triggers there are, whether they fire, what they do, whether their actions cascade onto other triggers ... it's a nightmare. Use triggers only for non-business functions.
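A minimal sketch of that encapsulation in application code, assuming a Python front end, a DB-API style connection, a transactional storage engine, and hypothetical orders, fee_schedule and transaction_log tables:

def insert_order(conn, customer_id, item_id, quantity):
    # One application-level procedure does the business insert and the
    # billing row together; callers never run the raw INSERT themselves.
    cur = conn.cursor()
    try:
        cur.execute(
            "INSERT INTO orders (customer_id, item_id, quantity) "
            "VALUES (%s, %s, %s)",
            (customer_id, item_id, quantity))

        # Look up the fee for this operation type and log the charge.
        cur.execute(
            "SELECT fee FROM fee_schedule WHERE op_code = %s",
            ("ORDER_INSERT",))
        fee = cur.fetchone()[0]
        cur.execute(
            "INSERT INTO transaction_log (customer_id, op_code, fee) "
            "VALUES (%s, %s, %s)",
            (customer_id, "ORDER_INSERT", fee))

        conn.commit()  # both rows or neither
    except Exception:
        conn.rollback()
        raise

The trigger-style bookkeeping sits right next to the insert it belongs to, which is the kind of encapsulation being suggested.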
David Aldridge Send private email
Tuesday, November 28, 2006
 
 
>> Transactions, commits and rollbacks are (as a group) probably the only "advanced feature" I'd research and practice with before I decided upon a database, if I needed that capability.

I'd add to that locking, transaction isolation levels, and scalability features.
David Aldridge Send private email
Tuesday, November 28, 2006
 
 
I followed a similar plan with requirements for fast performance, runs on Windows, and low cost. I evaluated MySQL, PostgreSQL and SQL Server, and decided on SQL Server.

PostgreSQL was just too slow on Windows.

MySQL can run > 5x faster than SQL Server, but only if using a non-journaled file system, which brings the risk of data corruption when power fails. With a journaled file system the performance was about the same as SQL Server.

SQL Server had the best overall features and the price was OK, as SQL Server Express is free and supports DBs up to 1G (beyond that I'll have enough revenue to buy SQL Server licences). But performance was still way too slow.

Testing on my desktop PC showed that disk I/O was the major bottleneck (for MySQL too). I messed around with faster disks, disk caches, RAID controllers, etc. I ended up buying a Dell 2650 server on eBay with 15K drives in RAID 0+1. This server is 3 years old, but CPU never exceeds 10% and it provides I/O performance that is just not possible on a desktop computer.
G Send private email
Wednesday, November 29, 2006
 
 
"I'd add to that locking, transaction isolation levels, and scalability features."

I'll also add such "advanced" features as referential integrity and data validity. Personally, I don't think '0000-00-00' is a valid date on this planet, do you?

And before somebody tells me "5.0 has it" I'll retort with "yeah, but is it in the default table type, MyISAM?" followed by "Since it isn't, is the table type that does have it, InnoDB, as fast as Postgres or the others?"  To which I'll expect silence.

http://sql-info.de/mysql/gotchas.html#1_14

Friends don't let Friends use mysql.
Cory R. King
Wednesday, November 29, 2006
 
 
>> Personally, I dont think '0000-00-00' is a valid date on this planet, do you?

Nor does Oracle ...

SQL> select date '0000-00-00' from dual;
select date '0000-00-00' from dual
            *
ERROR at line 1:
ORA-01843: not a valid month


Are there databases that do?
David Aldridge Send private email
Thursday, November 30, 2006
 
 
Before this becomes a flame war, I'm not so much concerned with which features are in a database versus which are not, as I am with the application developer's ability to fake any missing features.

Referential integrity and data validity are things you can do at the application level. Verifying that a user hasn't selected a date of 0000-00-00 is laughably trivial. Ditto for foreign keys and so forth.
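For what it's worth, a tiny sketch of the kind of application-level check being described, assuming a Python front end; as the rest of the thread argues, this complements rather than replaces database-level constraints:

import datetime

def valid_date(s):
    # '0000-00-00' (and anything else that isn't a real calendar date)
    # raises ValueError and is rejected here.
    try:
        datetime.datetime.strptime(s, "%Y-%m-%d")
        return True
    except ValueError:
        return False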

Transactions can also be coded in the application, but it's extremely difficult to get right due to the possibility of race conditions, database errors and other edge conditions. Scalability is something you pretty much can't compensate for at the application level once you've reached a certain point; either the database or the hardware has to improve.

However, I think G has the right idea even if it contradicted my own. The best solution really is to start with something simple (MySQL) and test it. If you find it lacking and you can't code a workaround, try something more powerful (Postgres). As your application's feature set grows, and you start to grow out of Postgres, start testing the big boys (SQL Server and Oracle). And so forth.

What he didn't say and probably should have said is that Oracle is not for the "newbie". Trying to design a new application, set up a new web site and become proficient with Oracle all at the same time is a bit too much to handle.
TheDavid
Thursday, November 30, 2006
 
 
"Referential integrity and data validity are things you can do at the application level. "

You are damn near cursing yourself writing that, you know.

The only people who say stuff like that are the people who haven't gotten burnt big time when they discover their magical perfect application has silently been corrupting their data. 

Every app has bugs, even yours.
Cory R. King
Thursday, November 30, 2006
 
 
By the way, in my experience, it is very hard to move out of MySQL. 

I'm very vocal about not using MySQL.  I've been badly, badly burnt while migrating my application to PostgreSQL.  I found all kinds of broken references with orphan rows, trashed dates, missed constraints, and separate accounts for "MaryJane" and "maryjane".  Yes, MySQL is case insensitive for queries, but it internally considers those different values...  who would have thought, eh?  Guess I forgot to add *that* application-level data validation check!  I had to decide whether "MaryJane" keeps her account, or "maryjane".  Some of those dupes were even paying subscribers!

Now that I've moved, I've gutted hundreds of useless "SELECT * FROM table" queries and replaced them with very specific ones.  I don't bat an eye doing a fifteen-table join with complex subqueries; I *know* they will come back in mere milliseconds.  As a result, I have been able to remove a TON of the client-side caching with memcached that I had to do with MySQL and its 100 queries per page view.  Loading the front page of my webapp now takes 5 queries instead of about 100.  Literally one giant query returns almost all of the data required for that page, and never once has it taken more than about 50 milliseconds.  Good luck with that using MySQL.

I've got my views and even views inside views.  I've got a really nice query analyzer that tells me why my queries are slow.  I use schemas all the time.  I've got stored procs.  I've got triggers to do a few things that might, gasp, be hard to do in my application.  I appreciate not having to worry about my perl application silently corrupting my data.  It complains loudly if I accidentally try to break referential integrity on the command line.  I can wrap EVERY SINGLE schema change I make inside a transaction (yes, you can roll back CREATE TABLEs in Postgres) so I can bail if it screws up mid-change.  Unlike MySQL, it even refuses to let me insert a NULL value into a column with a NOT NULL constraint* (this was a HUGE source of corruption when migrating off MySQL).
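A minimal sketch of the transactional-DDL point, assuming psycopg2 and a made-up schema change; in PostgreSQL the ALTER TABLE is rolled back along with everything else if anything fails:

import psycopg2

conn = psycopg2.connect("dbname=myapp")  # hypothetical database
cur = conn.cursor()
try:
    # Schema change and data fix-up in one transaction.
    cur.execute("ALTER TABLE accounts ADD COLUMN plan_code text")
    cur.execute("UPDATE accounts SET plan_code = 'legacy'")
    conn.commit()    # everything becomes visible at once
except Exception:
    conn.rollback()  # the ALTER TABLE is undone too
    raise
finally:
    conn.close()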

Best of all, I don't need to worry about my upcoming .NET app corrupting my data with bugs I don't know about yet.  I can't exactly port my perl libraries over, you know.

I'm happy as a pig in #%&8 with Postgres.

As far as I'm concerned, unless you are writing software like Wordpress that needs to run on stock webhosts, there is zero reason to use MySQL.

* http://dev.mysql.com/doc/refman/4.1/en/data-type-defaults.html
Cory R. King
Thursday, November 30, 2006
 
 
Well, it's obvious MySQL is not for Cory.  :)  It's still perfectly valid for people who don't have the same requirements as Cory does. For example, while I'm glad he loves Postgres, G complained it was too slow for his needs.

Please keep in mind the original poster. The OP didn't even know if their three-step plan would work. Not to belittle them, but it's incredibly unlikely they were qualified to analyze and compare databases to the same degree that Cory admirably did.

I still stand by my advice for them: if you need transactions, pick a database that supports them in a fashion you understand. For every other feature, plan on learning how to do them at the application level until you get more experience with how databases work.
TheDavid
Thursday, November 30, 2006
 
 
"incredibly unlikely they were qualified to analyze and compare databases"

True. My experience with SQL is limited to MySQL. I'm open to expanding my horizons and in fact this project is an opportunity to do precisely that. For me, getting from here to there is the thing. I've saved this thread (a few times, now) for reference.
an' on her mess
Thursday, November 30, 2006
 
 
"Referential integrity and data validity are things you can do at the application level. "

A database is a model of a real world environment. If something cannot happen in the real world then the RDBMS should prevent it from being represented in the database. For example, you can't have invoice details without an invoice -- you can't have a negative sale for particular transaction types, etc. You ignore this at your own peril.
David Aldridge Send private email
Friday, December 01, 2006
 
 
"I'm open to expanding my horizons and in fact this project is an opportunity to do precisely that."

To spare you another long diatribe from yours truly, I'll just say that you are better off learning database design on a platform that encourages good practices.  Install issues aside, "real" database systems are just as easy to learn with as MySQL.  You don't need to use all the whiz-bang features found in Postgres, Oracle or Microsoft SQL Server, but you'll appreciate those features being there when you gain enough experience to start using them.  You can also take comfort in the fact that these systems will *not* let you do the wrong thing, even if you don't yet understand *why* it is wrong.

MySQL makes it very easy to learn bad habits that will be very hard to back out of as you gain experience down the road.
Cory R. King
Friday, December 01, 2006
 
 
"Referential integrity and data validity are things you can do at the application level"

Oh dear.
Database purist
Saturday, December 02, 2006
 
 
David Aldridge said

"Don't use triggers."

Why not? Really, I'm genuinely interested.
Alvin Rhodes
Saturday, December 02, 2006
 
 
I never use the database for anything more than an advanced flat file.  Anything else is evil.  Why?

Because DHH and the Rails crew said so.  Because the guy down the hall used triggers to send email from the database server.  Because my cousin designed an entire template library using stored procedures.  Because my old frat brother was validating IP addresses using a reverse lookup inside a stored procedure.  Because my friend wrote a webapp and found out "the database was the bottleneck" even though he was making dozens of queries per page load to work around the shortfalls of his database vendor.  Because the guy who mows my lawn thinks referential integrity just "slows down the database".

Yes, I never use the database for anything more than an advanced flat file.  Anything else is evil.  Let me cover my ears now and go back to my cube....
Dr. Cory R. King, RDBMS
Saturday, December 02, 2006
 
 
>> Why not? Really, I'm genuinely interested.

It makes ongoing maintenance and support really difficult. When the application code is being read to understand what it does and how it does it, then for every update, insert or delete you have to consider the possibility that there is other code being triggered that does something else. That's not as much of a problem if it is just an auditing/logging function, but if it is updating a value elsewhere (in the same table or a different one) then you can be in a world of bafflement.

The other side of the coin is that you can be looking at a value in a table that might be updated by a trigger. You then have to find all the code that can cause the trigger(s) to fire and see what they do and when.

A better solution is to encapsulate the update/insert/delete in its own procedure (like a table API) and include the trigger code in there, then let that be called by other parts of the application.

Also, people seem to be allergic to putting comments in triggers.
David Aldridge Send private email
Saturday, December 02, 2006
 
 
So a trigger should be read-only with respect to the database (or if it updates a table, that table should be logically separate from the database, or preferably in a different database)?
an' on her mess
Saturday, December 02, 2006
 
 
Yes, it should be "application neutral", if that makes sense.
David Aldridge Send private email
Sunday, December 03, 2006
 
 
Yes, it does.
an' on her mess
Sunday, December 03, 2006
 
 
Migrating from any database version X to X+1 is always going to be "interesting".

But MySQL (for all its other flaws) does meet the requirements actually listed in the OP.


> When the application code is being read to understand what it does and how it does it then for every update, insert or delete you have to consider the possibility that there is other code triggering that does something else.

Triggers and stored procedures are part of the application.

If many applications touch the same database, then they become equivalent to a DLL that's shared between several applications.

(DLL hell / "trigger hell" being a good reason to moderate use of stored procedures and triggers, DLLs, or other code shared between applications. The "application neutral" theory has the advantage that buggy triggers don't damage the useful data.)

Even if a business rule such as "when X gets below 100 generate a purchase order" is coded in raw C code, everything that touches X can cause the code to execute, and this *should* happen without needing to remember to insert if( X < 100 ) runSpecialRule(); all through the application.

Naturally there needs to be some process for ensuring that the triggers and stored procedures on the production database go through the same QA checks that "code" does because editing them on the fly is no different to letting the developer install their IDE on the production server to directly compile new code for the server.


> Also, people seem to be allergic to putting comments in triggers

That could make me support the death penalty.  ;)

Seriously, though, the reason for the lack of comments is that people don't think of triggers as being as important as "real code", and thus misuse them horribly and skip all the safety checks that other code gets. (At least, in theory.)

Wednesday, December 06, 2006
 
 
Actually, that should be if( X < 100 && oldX >= 100 ) runSpecialRule(); so it doesn't fire too often. Unless it should. And if the rule changes with 500 copies spread through the code and 3 developers currently actively working on the product, may god have mercy on your soul.
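A minimal sketch of keeping that rule in one place, assuming a Python application, a DB-API connection and hypothetical stock and purchase_orders tables; the names are illustrative:

REORDER_POINT = 100

def create_purchase_order(cur, item_id):
    cur.execute("INSERT INTO purchase_orders (item_id) VALUES (%s)", (item_id,))

def set_stock_level(conn, item_id, new_qty):
    # Every stock update goes through here, so nobody has to remember
    # to scatter the reorder check through the code base.
    cur = conn.cursor()
    cur.execute("SELECT qty FROM stock WHERE item_id = %s", (item_id,))
    old_qty = cur.fetchone()[0]
    cur.execute("UPDATE stock SET qty = %s WHERE item_id = %s",
                (new_qty, item_id))
    # Fire only when the level crosses the threshold, not on every update.
    if new_qty < REORDER_POINT and old_qty >= REORDER_POINT:
        create_purchase_order(cur, item_id)
    conn.commit()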

Wednesday, December 06, 2006
 
 

This topic is archived. No further replies will be accepted.
