The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

How to express a multi-row constraint in the DB?

I'm trying to model a kind of "continuity equation" in SQL.

There are a number of cells, each with a number of input and output flows. The constraint is that at each cell, the sum of all inputs must equal the sum of all outputs.

I'm modeling this in two tables.

 -- other stuff

 cell_input integer REFERENCES cells (id),
 cell_output integer REFERENCES cells (id),
 flow numeric(10,5),
 -- other stuff

One of the foreign keys cell_input, cell_output will be NULL, and I have a constraint to ensure that every flow is either an input or an output. I can have any number of input and output flows for each cell. It's easy to calculate the total input and output for a cell with a JOIN.

Now I want to ensure that the database is always in a state consistent with the continuity equation.

I tried a table constraint on "flows" to express the continuity equation:
 cell_inputs (coalesce(cell_input,cell_output)) =
 cell_outputs (coalesce(cell_input,cell_output))

...where "cell_inputs()" and "cell_outputs()" are functions which sum up the flows for the given cell id by querying the "flows" table.

But this constraint doesn't work right. In fact, much to my surprise it even ensures that the "flows" table will NEVER be consistent!

That's because apparently the constraint is checked for each inserted row separately. The idea was to use multi-row inserts which should be validated against the constraint. But what really happens is that for the first inserted row the constraint is fulfilled, because the flows were balanced BEFORE the row insert. AFTER the first row has been inserted, then the check fails for all following insert rows, because now the first row is present in the table, putting the flows off balance.

How can I express a constraint like this on DB level?

- kisch
kisch Send private email
Monday, October 22, 2007
From a mile-high viewpoint it sounds like you need to use triggers to accomplish this. Whether that will work depends on your platform...
Monday, October 22, 2007
+1 on the triggers. I usually implement these kinds of squirrelly constraints from within triggers but be advised: if you're doing individual INSERTs, you'll run up against the same issue: The trigger will be fired for each INSERT statement.

You might want to initially do the individual INSERTs into some kind of staging table and then do a *single* INSERT into the true target table so that it's fired only once for that single INSERT statement. That's sort of how I imagine I'd go about it.
Monday, October 22, 2007
We do sort of what the Sarge suggests. 

Where we have these type of dependencies, our triggers use flags to tell them if a particular bit of code should evaluate (I'm not a DBA; I think we use session temp tables for the flags).  On the first row in a batch, the flag(s) get flipped off; on the last row they're flipped back on before the action starts. 

We use this technique in a number of specific places in our app.
a former big-fiver Send private email
Monday, October 22, 2007
Many thanks for the triggers ;-)

Here's another idea:
If constraints and triggers act on single rows only, then obviously all information which must be validated together has to go into a single row.

In the example case you would fold the "flows" table into the "cells" table. "cells" would get additional columns for "inputs" and "outputs", which each hold an array (yuck!) of flows.
Then a constraint check is simple.

Of course, this layout would also completely denormalise the DB.

Normally I would solve this kind of problem by keeping a set of normalised tables, hide them from direct access, and introduce a view to present the de-normalised column set. A pity that you can't have constraints on a view ;-)

Well, maybe this is a case where you could argue against a normalised design?

- kisch
kisch Send private email
Tuesday, October 23, 2007
kisch "Well, maybe this is a case where you could argue against a normalised design?"

Why not argue in favour of better constraint definition in the DBMS?


Gene Wirchenko
Gene Wirchenko Send private email
Tuesday, October 23, 2007
"Why not argue in favour of better constraint definition in the DBMS?"

I'm all for it!
But does any DBMS listen?

In fact, to have constraints on views would be sufficient, wouldn't it?

- kisch
kisch Send private email
Wednesday, October 24, 2007
Triggers don't act on a single row (this is MS SQL Server I'm speaking of), but they are 'fired' once per batch. They are basically a stored procedure that is automatically triggered by a specified condition.

You can check any number of rows for a combination of values to act upon.

People get confused and think that a trigger is fired once per row (iserted, updated, deteled), instead of once per batch.
Wednesday, October 24, 2007
==>but they are 'fired' once per batch.

umm ...


They're fired once per *statement*. If I batch up three INSERT statements, I get the insert trigger fired 3 times. You are correct, it ain't once per row ... but it ain't once per batch either. A "batch" is defined in MS SQL as all statements "batched up" between GO statements. A 3 statement "batch" looks like this:

INSERT yadda ...
INSERT yadda-yadda ...
INSERT yadda-yadda-yadda ...


That's a three statement "batch" (See the GO statement). The INSERT Trigger will be fired 3 times for this batch (again, regardless of how many rows are inserted).
Wednesday, October 24, 2007
Sorry, I misspoke when I typed that originally :-)

What I meant to say is that they fire once per statement. In your example, the trigger would fire for each insert. The situation I was trying to warn about is more for Updates and Deletes. In those cases, the trigger won't fire once per row, just once per statement.

A lot of people write triggers as if they are going to process each row in an update or delete statement, instead of just once.

(in the case of SQL Server).
Thursday, October 25, 2007
ok... stupid question here.  Why not put the logic in the middle tier (or the loader), and stop it from getting to the database in the first place?
Another Anonymous Coward
Friday, October 26, 2007
In my oppinion, when you consider to lift constraints from the lowest possible layer, i.e. the DB tables in this case, you could argue as well to implement them in the user manual instead (as the topmost application layer) ;-)

You can't bypass a DB constraint on a table, but you can easily have bugs in an application; you can even have more than one application for the same DB data, so that you would have to duplicate the constraint-code in an upper layer.

Also, the DB syntax for constraints is nicely declarative, which prevents bugs. (Well, obviously I managed to implement a bug nevertheless.)
kisch Send private email
Monday, November 05, 2007

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics
Powered by FogBugz