The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

Audit/Log Table question

I was having a design discussion with my team lead on how to store/track row level changes to a particular table in database and we both came up with two different approaches. To put things in perspective let me explain by a simple example.

Customer Table Structure:
customer_id,
first_name,
last_name,
member_since,
total_purchases,
purchase_frequency,
average_purchase_amount,
zip_code,
last_purchase_date,
.
.
.

My team lead wants an exact replica of the table plus an additional column for saving the time a row was modified.

I believe that only the customer_id,name of the column that changed, value and time of change should be recorded in the log table to avoid data duplication.

What are the pros and cons of each approach? Or is one approach better than the other?
Urumi Send private email
Saturday, April 21, 2007
 
 
It depends on what you are going to use the table for. The flatter one is going to be easier and faster to query for changes but takes up more space. It also has the advantage that you can probably make the timestamp happen automatically (regardless of what database you are using), meaning that the exact set of in-memory values for the columns can be saved to the audit table:

insertOrUpdate(realTable, values)
insert(auditTable, values)

The other method optimizes a bit more on space and allows you to see exactly what columns actually changed (without doing comparisons between audit records), but it's more complicated to implement - both from a writing perspective and a reading perspective.

One thing to consider is whether you expect row-level changes to make changes to most of the columns all at once or typically just one or two columns. Another thing to consider is how big your database will likely get and how cheap is disk space, compared to software development.

So ... personally, I would use the simplest thing that will solve your immediate requirements. My gut tells me the flatter way (whole row) is fine, but you know the requirements better than I.

-Patrick Foley, ISV Architect Evangelist, Microsoft Corporation
-Starting an ISV? Check out http://www.empowerforisv.com?CID=7C2Q98F66C16SX to get nearly $35,000 worth of software for $375/year.
Patrick Foley Send private email
Saturday, April 21, 2007
 
 
Another thing I've seen people do is save all changes in an XML blob, so you can generalize audit functionality across all tables. There are pros and cons to this as well, obviously.
Patrick Foley Send private email
Saturday, April 21, 2007
 
 
Sometime ago I worked with two systems in the same shop, one that logged just the value, user and datestamp and the other that saved the full row, the user and a datestamp.

With the full row audit one it was easy to see the history just by visual inspection of the results of the query. It was also very easy to rollback records to previous versions. The problem is that there was a 1 to 7 ratio between the data and the audit so the 1 Gb of data lived in an 8Gb database. Queries were slow-ish.

The single value audit was fast to query and did not add significantly to the database size. It was much more difficult to compare records though, and rollbacks were just too risky.
JSD Send private email
Saturday, April 21, 2007
 
 
The way we're doing it is to store every change to every table in an audit table this the structure:

DateTime
WindowsUsername
SystemUsername
ComputerName
Action
Info

The Action field stores whether this is an add, a delete or a modify. The info field contains XML, with one of the following:

If Action is "add", it stores the entire record as it was originally entered.

If Action is "modify", it stores the fields that changed, the old value and the new value.

If Action is "delete", it stored the entire record as it was before it was deleted.

This allows us to do some cool things, including doing a list of every field that changed in the previous edits (for example, John Doe edited this record on 1/1/2007; Last Name changed from Smith to Jones; Phone Number changed from 555-1212 to 555-1213, that sort of thing).

It also allows us to do really rapid searching and filtering on who did what and when. All the info could, of course, be stored in XML, but that would slow down queries  of the data.

This table can, of course, get hella big, so you'll probably want to put in things to delete old data, or archive it, or store the XML in a compressed format, or something to that effect.
More or less anonymous Send private email
Saturday, April 21, 2007
 
 
I forgot the field "TableName". :-)
More or less anonymous Send private email
Saturday, April 21, 2007
 
 
Agreed with Patrick Foley.

Your Team Lead Way:
Pro:
* Easy to understand and maintain - even when column name changes
* Easy to implement => fewer bugs
* Efficient and Easy to query the "change" log

Con:  Waste of Space.

Here is a checklist when I had similiar problem:

* How much work is required to modify a Table/Constraints? (Add/Remove/Change Columns)
* Is it EASY to query the change history?
* What do you do with the change history? (Is it just for auditing purpose of there is something more?)
* How do you deal with depending tables?

PS: Clarity before Maintainability before Optimisation

Saturday, April 21, 2007
 
 
My approach to audit logging is documented at http://www.tonymarston.net/php-mysql/auditlog.html

I have an AUDIT database which contains just four tables - SESSION, TRANSACTION, TABLE and FIELD. The FIELD table stores both the 'before' and 'after' image of each field that actually had its value changed. This design allows me to record changes to any database table very easily as the audit table does not have to mirror the structure of the 'live' table.

Because all the audit data is stored on just 4 tables it is also very easy to have just one set of online forms which allow the contents of the audit database to be interrogated using a wide variety of search criteria. This is very useful in unit testing as I can quickly discover what database updates I have just performed so I can tell whether they were correct or not.

Writing to the audit database is performed within my Data Access Object, not by database triggers, so it is very easy to turn logging ON or OFF for any table. There is also a simple mechanism will allows me to turn logging OFF for specific fields.

I have implemented this design in two different frameworks in two different languages, so I know how well it works.
Tony Marston Send private email
Sunday, April 22, 2007
 
 
Chris Winters Send private email
Monday, April 23, 2007
 
 
Your team lead is right.

It's not a waste of space.  It's valuable data which tracks the entire change and allows you to determine state at any given point in time.  Create the audit logs in a separate tablespace and consider archiving/deleting records on a reoccurring basis.
. Send private email
Tuesday, April 24, 2007
 
 

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics
 
Powered by FogBugz