The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

Database design question

My application needs data from an external provider.  This data contains multiple objects (10 or more), and zero to many remarks on the object in general and/or on specific properties of the object.  The objects are very straightforward; my question is about how best to model the remarks.

A few points about the app:
- needs support for SQL 2000 or better
- uses GUIDs for PKs
- some customers may write reports against the database, so keeping the structure as understandable as possible is a plus

Scenario 1:
Separate Remark tables (one for each object)

Contact
--------------
ContactID (PK)
FirstName
LastName

ContactRemark
--------------
ContactRemarkID (PK)
ContactID (FK)
PropertyName
Remark
Sequence


Scenario 2:
Single Remark table

Contact
--------------
ContactID (PK)
FirstName
LastName

Remark
--------------
ContactRemarkID (PK)
ObjectID (FK)
PropertyName
Remark
Sequence


Scenario 3:
Dual Remark tables

Contact
--------------
ContactID (PK)
FirstName
LastName
RemarkSetID (FK)

RemarkSet
--------------
RemarkSetID (PK)

Remark
--------------
RemarkID (PK)
RemarkSetID (FK)
PropertyName
Remark
Sequence


Comments about the above scenarios or alternate designs are GREATLY appreciated.  Thanks in advance.
BrandonK
Tuesday, October 17, 2006
 
 
I'm a little confused about the relationship between the contact and the object being referred to.

Let's say for the sake of argument you're writing a real estate application, and your objects are properties being sold. In this case, I like scenario 1 the best if you substitute the property for the contact, i.e...

table 1
========
PropertyID
PropertyAttribute1
PropertyAttribute2
ContactName
ContactPhone

table 2
========
RemarkID
PropertyID
RemarkDescription
Sequence

Scenario 1 as described by the original poster makes perfect sense if the contact IS the object that we're remarking on.
TheDavid
Tuesday, October 17, 2006
 
 
If a remark is specifically associated with a contact then I'd use a dedicated table for contact remarks, and likewise for other remarks associated with other particular entities.

What is PropertyName, btw?
David Aldridge Send private email
Tuesday, October 17, 2006
 
 
I'd use scenario 3, I think.

In scenario 1, we get 10 remark tables that are almost identical.  In scenario 2, we have the yucky generic column objectId.

my 2c
Mike S Send private email
Tuesday, October 17, 2006
 
 
And if necessary you could define XxxxxRemark views for your end-users to query...
Mike S Send private email
Tuesday, October 17, 2006
 
 
-- I'm a little confused about the relationship between the contact and the object being referred to.

I was using a Contact as an example of an object being imported (the actual objects are business specific and would require too much explanation).  There are at least 10 other objects being imported

-- If a remark is specifically associated with a contact…

A remark can be associated with any of the objects being imported.  I listed only the Contact for brevity.

-- What is PropertyName, btw?

Remark.PropertyName records the name of the Object.Property.


An example of the data being imported (from a flat file):

Contact1, “Fred”, “Flintstone”
Remark1, Contact1, “”, “Lead rock crushing technician at quarry”, 1
Remark2, Contact1, “”, “Special expertise in volcanic activity”, 2
Remark3, Contact1, “FirstName”, “Becomes agitated when called ‘Frederick’”,1
Remark4, Contact1, “HomePhone”, “Asked not to be called after 8pm”, 1
Contact2, “Barney”, “Rubble”
Remark5, Contact2, “”, “Won ‘Happiest Employee’ award for 2006BC”
BrandonK
Tuesday, October 17, 2006
 
 
Should have also added that a Remark is associated with only one object instance.  So a Remark for Fred Flintstone would not be linked to other Contacts, or other object types.
BrandonK
Tuesday, October 17, 2006
 
 
I'd go with scenario 1, with a twist: every object has its own table and its own remarks table.

The rationale: when dealing with only 10 objects (or that order of magnitude) it's not really worth the time and effort to develop "smart" tables such that a remark knows which table it is associated with at the record level.
TheDavid
Tuesday, October 17, 2006
 
 
Before you can answer the question, answer this question: what are you trying to optimize for? Speed of retrieval? Space? Referential Integrity?

That answer then would determine what the original question's answer should be.
Steve Hirsch Send private email
Tuesday, October 17, 2006
 
 
Thanks to everyone who's replied so far!  Great posts!

My design goals, in order of importance, are:

1. Data integrity
2. Speed of retrieval
3. Ease of understanding
4. Disk space

#3 is for customers who want to write their own reports.  Some customers have dedicated IT depts with report writers who understand relational design.  Other customers are have power users with moderate knowledge of Access.  While good design is important, I don't want to blow them away with an overly complex solution that they call customer support about.

Other design considerations:
- The data is reference only; end users can view but not edit the data.  The app uses the data for lookup and validation logic.
- The data set gets updated once a month.
- I'd estimate < 100K total object records per update, with < 750K total remark records
BrandonK
Tuesday, October 17, 2006
 
 
For your goals, I'd go with a global remark table. This is particularly important for reportin as it can easily allow pivoting.
Sevenoaks Send private email
Tuesday, October 17, 2006
 
 
>>For your goals, I'd go with a global remark table. This is particularly important for reportin as it can easily allow pivoting.

Yuk. Common lookup tables of this type are a very bad thing. They obscure cardinality information from the optimizer and do not make queries any easier. They are a lazy person's way of sabotaging performance, and maybe integrity as well.
David Aldridge Send private email
Tuesday, October 17, 2006
 
 

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics
 
Powered by FogBugz