The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

Entity, Attribute, Value database design and a call for help?

I've created this db schema in Firebird that has essentially only 4 tables.

1. THINGS - Customers, Employees, Documents, Jobs, Invoices tc
2. IDENTIFIERS - identifies a thing with a "code" ie: INV123 (probably a redundant table, but there you go)
3. ATTRS - holds attribute names - ie: "Invoice Number"
4. ATTR_VALUES - holds the value of the attribute - ie: "12345" (if ATTR = Invoice Number)

I link things together with the name of the attribute and append the text "_IDENTIFIER", so a Customer's list of attributes may be:

First Name
Last Name
Address etc
PAYMENT_IDENTIFIER etc - you get the idea.

When I see an attribute with the _IDENTIFIER suffix, I extract the first portion (the name of the THING, ie INVOICE or PAYMENT) and then grab all the attributes for that thing, given its IDENTIFIER code (that's why I have the IDENTIFIERS table! I knew there was a reason.)

Presenting normal "grid" lists is achieved by having  support tables called LIST_COLUMN_PROFILE which holds the name of the list ie: Customer List (Default) and LIST_COLUMNS, which holds a row for each attribute that should be in this list. A stored proc (dynamically created for that particular list) then retrieves the attribute values and returns them in a regular recordset - 1 row, multiple columns.

This also means that the end user can create a new list profile of just the info they want to see. ie: The default invoice list shows invoice number, date, amount etc, but the end user may want to see just invoice number and amount - the dynamic generation of the stored proc allows this. (The grid I put the data in also allows for sorting on any column.)

Searching within a list, or filtering is achieved by letting the users select from a list of attributes, and then selecting just the values that are already in the database for that thing/attribute combination. Think SELECT DISTINCT etc.

At the moment, I've only used equality as a comparison - ie: if this attribute = this value. I haven't added other comparison operators yet, such as greater than etc.

I've also added another table called ATTR_COUNTS that, whenever a new attribute and value is added for a particular thing, I increment the count. I can then let the user select a thing, and see the top ten attributes and counts. ie: selecting the SALES thing shows the attributes for SALES - selecting one of its attributes, say PAYMENT METHOD (and payment method is a string like VISA, MASTERCARD, DINERS, CASH, CHEQUE, MONEY ORDER etc) causes the graph to show which payment method is the most popular. And its pretty quick too because the counts are pre-populated.

What else?

Oh yes, it uses Jabber to communicate between running instances of the app. If the app on say PC "A" receives a message (of a properly formatted JEP-0004 - see Data Forms) then it can use that data to lookup information in the db, and send the results back. I've tried it with a normal Jabber client, and it works like a treat. So basically, you set a flag in the ATTRS table which says that this attribute is "published" (meaning people can search on it) and PC "B" can then search PC "A"'s database for say Inventory. Using Jabber also allows PC "B" to send a message that is formatted (and I havn't gotten this far yet) as a purchase order, using XBRL. PC "A" can then parse the purchase order etc......

I'm thinking that gives the small business operator a simple way of being able to maintain inventory, and receive orders into a "single" system, rather than them need to have an accounting system, say Quickbooks or MYOB with inbuilt inventory, which is fine, but then they also need to have an online website that customers can buy from. Upload QB inventory to website. Have shopping cart etc. Pain pain pain. My thoughts are for the small business operator to maintain a website, but offer the "customer" version of this application as a free download so the customer can search the small business inventory with the local app instead of a web browser etc. And then programmatically, via the local app, create / access Paypal for purchasing stuff etc.

(I guess this last part should really have been posted in JOS Business of Software, but the two are quite closely tied together - how the software works, and how it can be used.)

I want to make this open source too, but its not quite there yet. see - there's a couple of documents there, but no source code yet.

Anyway, I guess the reason for the post is to ask if anyone else has done anything like this, and perhaps we could collaborate or bounce ideas off one another?

I feel very small and alone with no-one to talk to about the ideas I have.  :)

Thanks for reading this far. BTW - I am a VB6 Classic developer - have been for years, and see no reason to change..... :)

Brad Thomas Send private email
Thursday, April 13, 2006
So, you aren't going to have any attribute values which aren't text?
Database purist
Thursday, April 13, 2006

No. :)  The column is defined as Varchar(128).

Its still very much in an embryonic form at the moment - although I have used it to creaste some good things (not publicy showable / viewable though :(  )

My previous design had this horrible setup where each value was stored in a column of that particular data type, but that became far too unwieldy, so I tried to simplify it a lot by having just a varchar(128) column.

I've also created a Visual Basic user control that has a property called DataSubType - possible values are Text, Memo and Blob, meaning

Text - fits in the standard varchar(128) limit
Memo - the content of the usercontrol etc gets stored in a table called ATTR_MEMO where each column is defined as varchar(32767)
Blob - the content of the user control is stored in a table called ATTR_BLOB where each column is defined as a BLOB column.

Anyway, thanks for responding. I appreciate it.
Brad Thomas Send private email
Thursday, April 13, 2006
Well, a VarChar field stores text, doesn't it?

This looks very bad to me. You've asked for opinions. Stop this now and do a proper database.

You're using tables, with relatinships set between them. So you're making a relational database. Whether you like it or not. So design the database like it's meant to be. It looks to me like all your user data (as opposed to metadata) is stored in one column of one table (though it turns out maybe more, now that you've 'realised' that you can't get a memo sized lump of text in a varchar).

What are you going to do when dates need to be stored? Integers? Floating point numbers?

It looks to me like it's a fairly standard CRUD app to me. Customers, employees etc. Design the tables like that. You're trying to come up with a 'simple' table design that will allow you to store 'anything'. It will be a god awful mess.

Of course that's just my opinion.
Database purist
Thursday, April 13, 2006
I concur wholly and completely with Database Purist. Stop now! You're creating much more work for yourself both in the short and long term than if you just used standard relational design.
Chris McKenzie Send private email
Thursday, April 13, 2006
I chased the EAV model about 6 years ago. I've since grown up. You should too. They suck, for too many reasons to go into here. Google should tell you.
Thursday, April 13, 2006
At first this type of thing appears clever.  Then you try to create a report or something more sophisticated and it is very unwiedly.  The performance and scalability is going to be abyysmal and the maintenence is yeach. 

Don't do it.  Many people have gone down this road and gotten royally bitten.  Design a proper ER model and use that.  You have no hope of any referential integrity, any use of native datatypes (eg dates, numbers, and binary types).  Mapping those things to character types is a very bad practice and will create huge problems down the road.  In VB you don't restrict yourself to just character variables of 128 bytes, you would laugh and make fun of any programmer who did that. (and rightly so)
Jim Send private email
Thursday, April 13, 2006
An identifier is a thing. Why doesn't it go in the things table? Same for attributes, and attribute values. All things. Why not just have one table called tblThing? Columns would be ThingID, ThingName, ThingValue. All done.
Larry Lard Send private email
Thursday, April 13, 2006
This works fairly well for user-added attributes that are only *ever* used for reporting.  If you want to use them for non-trivial processing you'll spend a lot of time and cycles converting them to/from proper datatypes.  If you have real, identifiable entitites you should define them properly.

I say this working for a company whose product allows users to define an unlimited number of add-on data elements (like you describe) and to redefine 20 (of 2000 or so) known data elements to be of whatever type suits the client business (it all gets stored as strings).  When one or more of these 20 elements are involved in type-specific operations, we use the attributes to determine the native type, copy the data to a scratch table with properly-typed columns and then do the work. 

A HUGE amount of effort has been expended to do this because it is central to our not-inexpensive product.  It gives our clients the flexibility they want but changes take very careful deliberation.
a former big-fiver Send private email
Thursday, April 13, 2006
EAV gives you extreme flexibility, but at the expense of extremely poor performance. It works only when you only have a very low load on the system for reporting or data change. Scalability is practically non-existent, and it's a bugger to support.
David Aldridge Send private email
Thursday, April 13, 2006
For me, the key insight that stopped me from doing 'things' this way was that the further I went, the closer and closer I got to having created my own relational database and further and further from something an end user could work with. I've got a relational database already and the end user has already hired me. 'nuff said.
Ron Porter Send private email
Thursday, April 13, 2006
Thanks to every one for your comments.

Points taken.

Its not for any commercial product, just something that I have used to store some data for myself, and my wife's business (contacts etc)

I've had a lot of fun with it though!

Brad Thomas Send private email
Thursday, April 13, 2006
Not to pile on, but I just wanted to second what seems to be the consensus.  It may work for you now, but if the project grows you'll be in a lot of pain down the road.  If you're going to use a database, leverage the functionality of a database (typed data, constraints, indexes).  Otherwise, you've just got a big ini file.
bmm6o Send private email
Sunday, April 16, 2006
==>For me, the key insight that stopped me from doing 'things' this way was that the further I went, the closer and closer I got to having created my own relational database and further and further from something an end user could work with.

Yeah -- the point I gave up on it was similar. My business partner referred to it as my " database within a database" -- and he wasn't referring to it that way in a good way, more like "This is stupid, dude, you're reinventing the database."

Been there, done that. Wished I didn't.
Monday, April 17, 2006
OMG - don't do it! Make a proper normalized database, please. You will end up on the Daily WTF otherwise...
Tuesday, April 18, 2006
The Jabber message passing angle is interesting. Have you thought of tying it together with p2p/BitTorrent distribution of data? That's a bit out there of course but it would be interesting!
Sunday, May 07, 2006

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics
Powered by FogBugz