The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

One table per name/value or single table?

For simple enums (by enum I mean a name/value pair), i.e., things like Gender = {male, female}, States = {ny, ca, etc.} are there any downsides to having multiple tables, i.e., a Genders table that has two records and a States table, vs. a single Enum table? 

By assumption, I don't want to use a natural key (i.e., store 'male' or 'female' in the foreign key).

I prefer, and would presume one table per enum is generally preferred, but is there any potential performance implications of having lots of enum tables , that could easily double the number of tables in the database, vs. 1 incremental table?

Thanks.
bglenn
Monday, December 04, 2006
 
 
I don't know what the performance implications are of one table per enum, but I've never seen that (admittedly logical) arrangement implemented. In every shop I have ever worked in there has  been one giant central table (usually called something like T_CODES or CODEVALS or somesuch) that has at least three columns: CODE_TYPE, CODE_NAME and CODE_VALUE.

I've never thought that this was a very good arangement, since it is more difficult to code constraints against (rather than say that a field value must appear in the appropriate enum table, you need to say that the field value must appear in the codes table where CODE_TYPE is some specific type), but it is a common design pattern. I assume that there is some technical reason to do it this way and that it doesn't hamper performance too much.
Jeff Dutky Send private email
Monday, December 04, 2006
 
 
Instead of using a foreign key constraint, another option is to use a CHECK constraint: http://discuss.joelonsoftware.com/default.asp?design.4.366337.13#discussTopic366610
Christopher Wells Send private email
Monday, December 04, 2006
 
 
You could use a CHECK constraint, but why bother? Using separate enum tables is clearer, requires less work to define the constraints, and makes the relationship clear in the database diagram.

I've heard of the One Big Table approach but never seen it in the wild, and I have no idea what advantages (if any) it has.
clcr
Monday, December 04, 2006
 
 
I've done the multiple-table approach, and I've worked at a place that did the single-table approach.  I much prefer multiple tables.

One problem you can get with One Giant Table is that if it turns out you need an extra field later (like, say, a sort column), now _all_ your enumerations will have that field, and not only the ones it applies to.  Or else you'll have to break that set of values out to a separate table...which you wouldn't have to do if you'd started with multiple tables in the first place.

One Giant Table works better if you can abstract it out so no one has to know all the data is coming from One Giant Table.  But I still prefer the flexibility of multiple tables.  The obvious downside is having all those tables all over the place, but a good naming convention alleviates that difficulty.
Kyralessa Send private email
Monday, December 04, 2006
 
 
Performance wise, the one table approach is probably better since the database can essentially cache that table.  However, just as there are scenarios where you don't want to use 3NF, there are valid "human" reasons for not wanting to lump every enum into a single table.

States are a perfect example; if I have States and Countries intermingled into a single table, it becomes that much harder for someone else to come onboard and maintain my project. There are also, presumably, enums that are easier to name if you relied on both a name and a context as a table name.

As a stupid example, if you had one table, you'd need ALLOWABLE_PLANT_GENDER, ALLOWABLE_HUMAN_GENDER, ALLOWABLE_MINERAL_GENDER, ALLOWABLE_ALIEN_GENDER and so forth.

Alternatively, if you had PLANT, HUMAN, MINERAL, and ALIEN tables, you can simply have one ALLOWABLE_GENDER field in each table and it would be easier to create the appropriate "objects" in the application.

I vote for the middle ground where you have contextually grouped tables (locations, people related, product related), each of which has any number of enums.
TheDavid
Monday, December 04, 2006
 
 
I vote for what Kylaressa wrote: you never know which of your lookup tables will need additional fields. That's killer.
Peter {faa780ce-0f0a-4c28-81d2-3667b71287fd} Send private email
Monday, December 04, 2006
 
 
Multiple tables are the logically and practically correct implementation. Aside from the constraint problem, throwing everything into a single table makes query optimisation more difficult by hiding the cardinality of the lookup. When joining to a large number of records in the lookup you are probably also stuck with an index-based method instead of a more efficient full table scan.
David Aldridge Send private email
Monday, December 04, 2006
 
 
"Performance wise, the one table approach is probably better since the database can essentially cache that table"

Why can't it cache multiple tables? No reason at all.

Performance-wise one table is the worst option by far.
David Aldridge Send private email
Tuesday, December 05, 2006
 
 
This question highlights the potential difference between a logical and physical model.

If you can properly abstract the multiple tables away from the developers, then I'd vote multiple tables.

If a non-database person will be in the db alot, I'd lean towards one big one.

Alot of things can change depending on the amount of CRUD going on however, YMMV.
D In PHX Send private email
Tuesday, December 05, 2006
 
 
"If a non-database person will be in the db alot, I'd lean towards one big one."

Why?

:)
David Aldridge Send private email
Tuesday, December 05, 2006
 
 
All the more I'd go with multiple tables if a non-DB person will be in there a lot.  With the One Giant Table model, a careless UPDATE or DELETE statement could be disastrous.
Kyralessa Send private email
Tuesday, December 05, 2006
 
 
This is more of a philosophical issue. I've seen it done both ways, sometimes at the same time. My preference is to have these in 1 or 2 tables (TypeCodes and StatusCodes if there are 2), rather than in 50+ tables. But again, as Kyralessa says, if your customer/maintainer is unskilled, separate tables for each and every look up table.

A list-order column is a smart idea. There are lots of times when someone wants to re-order the items in some drop-down, and a list oder column makes such a change rather quick.
Peter
Tuesday, December 05, 2006
 
 
"This is more of a philosophical issue"

Well, no it isn't. Not if shoving everything into a single table is harmful for performance and makes it more difficult to use the correct data type. One table is the lazy way, and it's logically and physically incorrect.
David Aldridge Send private email
Tuesday, December 05, 2006
 
 
I learned all this stuff on the fly and often took bad advice as I was learning. Here are the two most important things I've learned about databases:
1. There is no such thing as a lookup table in the sense I think you mean. If the entity definition really, absolutely never can change, then you must be looking further into the future than I can. Almost every 'lookup' I've ever created eventually had other fields added (i.e. modified entity definition). If the data itself (i.e. Male, Female) really can never change, then there's no need for a table at all, is there?
2. Any attempt to 'super-normalize' ultimately leads to using a 'real' RDBMS to implement a custom RDBMS. Leave the fancy footwork to someone else. Super-normalisation is the result of waking up one morning with the amazing insight that everything is 'just' a lookup of a particular type and that type is still just a lookup. (see 1).

I'm sure someone can tell me where I've gone wrong, but it works for me. I say go with separate tables.
Ron Porter Send private email
Tuesday, December 05, 2006
 
 
Consider a system where many such entities can be grouped into what may commonly be referred to as Lookup Items (which reside in Lookup Categories) eg:

Category ( Example items )
---------------------------
Title ( Mr, Miss, Mrs ... )
Occupation (Enginner, Pilot, ...)
EmploymentStatus (Temp, Fulltime, Contractor, ...)

If your app is such that no decisions are ever made based upon the values of the lookups you have chosen (ie they aren't involved in any business logic) then one advantage of using the One Big Table approach is that it makes your Content Management for adding/editing lookup items quite simple to implement.

You would simply have two domain objects: LookupItem and LookupCategory, and one screen which allowed you to add to, remove from, or edit values within these categories.

Contrast this with having to implement domain objects and CRUD code for each of these entities as well as an Add Title screen, an Add Occupation screen, an Add Employment Status screen, etc.

It can save on development time and is quite flexible, but it is not a fit for every situation.

I have worked on an enterprise framework which used this approach, and many enterprise systems utilising this framework and it worked very well.
One table to rule them all Send private email
Tuesday, December 05, 2006
 
 
Fantastic!!

If that's what is used in an 'enterprise framework' then that's obviously the way to do it.
Database purist
Tuesday, December 05, 2006
 
 
"Contrast this with having to implement domain objects and CRUD code for each of these entities as well as an Add Title screen, an Add Occupation screen, an Add Employment Status screen, etc."

You wouldn't need to implement separate code for each entity. Just write one piece of code and parameterize it with the database table name and the entity name to be displayed on the form (Title, Occupation, etc). It might not be Enterprisey but it's simple, easy to maintain, and preserves the ideal data structure.
clcr
Tuesday, December 05, 2006
 
 
"I have worked on an enterprise framework which used this approach, and many enterprise systems utilising this framework and it worked very well. "

And on many applications, especially enterprise ones, this is a serious design error that is regretted later when the database doesn't perform.
David Aldridge Send private email
Tuesday, December 05, 2006
 
 
I think you need to ask yourself what do you plan to do with the application.  Is it for a particular client or will it be canned.  I had been using one table per enum generally, but then I was poking around in a database of some commercial software I needed to hook into, and found this one table approach and then saw its advantage in some cases.  This software package was huge but not intended to be one size fits all.  During setup lots of tweaks and modification were done on site by the consults setting up the program.

Low and behold when I looked at these tables (actually there were a few to rule them all) I found that what they where doing were adding lots of these enums on the fly.  The moral of the story is that if this is a package that will frequently have custom modifications made then implementing one table will be a lot easier to support than adding tables every time you need to do a custom change.
Terry Mc
Thursday, December 07, 2006
 
 
" "I have worked on an enterprise framework which used this approach, and many enterprise systems utilising this framework and it worked very well. "

David Aldridge > And on many applications, especially enterprise ones, this is a serious design error that is regretted later when the database doesn't perform."

I'll repeat:

It can save on development time and is quite flexible, but *it is not a fit for every situation*.
One table to rule them all Send private email
Thursday, December 07, 2006
 
 
I've used the "one big table" approach for one thing and one thing only - status codes.  It's been tremendously useful to me over the years, and anyone who has had to maintain a large and highly organic database could probably guess why.  While this is the only example that has ever been useful to me, I can't say that there aren't others.

To say that it is *never* prudent to use a single table that is shared across many domains isn't just narrow-minded; it's dumb.
prudent
Friday, December 08, 2006
 
 
Then I'll repeat also ... "on _many_ applications ... this is a serious design error"

"To say that it is *never* prudent to use a single table that is shared across many domains isn't just narrow-minded; it's dumb. "

There is almost no example of bad design practice that cannot be excused by applying it to trivial situations -- very small applications, ones with hardly any activity, ones with very specifically limited operation (eg only ever doing single row lookups), ones with deadlines that require expediency over correctness. However that does _not_ turn them into good design practice.
David Aldridge Send private email
Friday, December 08, 2006
 
 
Then I'll repeat also ... "on _many_ applications ... this is a serious design error"

"To say that it is *never* prudent to use a single table that is shared across many domains isn't just narrow-minded; it's dumb. "

'There is almost no example of bad design practice that cannot be excused by applying it to trivial situations -- very small applications, ones with hardly any activity, ones with very specifically limited operation (eg only ever doing single row lookups), ones with deadlines that require expediency over correctness. However that does _not_ turn them into good design practice.'

To which I add that a useless program gets thrown away, and a useful gets kept and probably changed.  If I need to expand one of the lookups to include another column, now, I have a rewrite on my hands.  I find it much easier to design a separate table and modify the code for maintaining it according to the differences.

Sincerely,

Gene Wirchenko
Gene Wirchenko Send private email
Friday, December 08, 2006
 
 

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics
 
Powered by FogBugz