The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

An O/R mapping question

I've been reading around (as linked to by a poster here) and playing around with a few toy apps. One thing that has come up, and that I can't figure out a good answer to, is the question of how we know when the object mapped from a particular record is already loaded.

Suppose we have Employees and Departments in the DB, mapped to Employee and Department classes. We start off looking at employee A - we create the object, read its details from the DB, deferring getting the department details until we need them. Later, we need the department details, so we create a Department object for department X, read its details from the DB, and point to it from the Employee object.
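To make the deferred loading described above concrete, here is a minimal sketch. It assumes a hypothetical in-memory dictionary (`FAKE_DEPARTMENT_ROWS`) in place of the real database and a hypothetical `load_department` helper; neither comes from any particular O/R mapping library.

```python
class Department:
    def __init__(self, dept_id, name):
        self.dept_id = dept_id
        self.name = name


# Hypothetical stand-in for the database: department names keyed by id.
FAKE_DEPARTMENT_ROWS = {10: "Research"}


def load_department(dept_id):
    """Pretend to read one department row and build an object from it."""
    return Department(dept_id, FAKE_DEPARTMENT_ROWS[dept_id])


class Employee:
    def __init__(self, name, dept_id):
        self.name = name
        self.dept_id = dept_id    # foreign key read along with the employee row
        self._department = None   # department details not loaded yet

    @property
    def department(self):
        # Load on first access only, then reuse the stored reference.
        if self._department is None:
            self._department = load_department(self.dept_id)
        return self._department


a = Employee("A", 10)
b = Employee("B", 10)
```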

That's all fine.

But now we look at employee B - create object, read details from the DB, go to access B's department - which is the same department as A's, so we already have a Department object in memory representing this department. We surely don't want to create another object representing the same entity. But how are we to know that A already has a reference to department X?

The only system I can think of is that the Department class itself manages all the Department objects with static properties and methods: we would ask it "I want department X. Either load it and give me it, or just give me the reference you've already got." Internally it would keep a map of <identifying attribute> -> Department.
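A rough sketch of that idea, with the class itself keeping the map, might look like the following. The `get` and `_load_from_db` names and the `FAKE_ROWS` table are assumptions made for illustration, not part of any real framework.

```python
# Hypothetical stand-in for the departments table.
FAKE_ROWS = {10: "Research", 20: "Sales"}


class Department:
    # Class-level map of identifying attribute -> Department instance.
    _loaded = {}

    def __init__(self, dept_id, name):
        self.dept_id = dept_id
        self.name = name

    @classmethod
    def get(cls, dept_id):
        """Return the already-loaded Department, loading it on first request."""
        if dept_id not in cls._loaded:
            cls._loaded[dept_id] = cls._load_from_db(dept_id)
        return cls._loaded[dept_id]

    @classmethod
    def _load_from_db(cls, dept_id):
        # A real implementation would run a query here.
        return cls(dept_id, FAKE_ROWS[dept_id])


dept_for_a = Department.get(10)
dept_for_b = Department.get(10)
```

Both calls return the same object. Note that the map holds ordinary (strong) references, so nothing it has loaded is ever released.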

But then how would Department objects ever get garbage collected? Even once there were no live Employee objects referencing them, the class's reference would keep them live. Unless the class's references were weak references? But weak references are usually a sign your design is wrong, in my experience.

Any insight will be gratefully received!
Larry Lard
Thursday, May 11, 2006
Google for identity maps.

Allan Wind
Thursday, May 11, 2006
The best places to start are Martin Fowler's book and site:
Berislav Lopac
Thursday, May 11, 2006
I think all the extra complexity you would be adding is not worth it just to save a few bytes in memory and an extra call to the db.
Thursday, May 11, 2006
The db4o project is an object database that automatically handles exactly the type of reference resolution that you're talking about. When the database engine retrieves objects from the object-store, it never brings back multiple instances of the same object (unless the database contains multiple equivalent objects).

It uses weak references to maintain references in the object graph, so those references don't prevent the objects from being garbage collected.

I haven't tried it out myself, but it looks intriguing, and the benchmarks are good (better than Hibernate/MySQL).
BenjiSmith
Thursday, May 11, 2006
Identity map, as others have said.  Basically it's a repository of all your retrieved objects.  It's in charge of returning the object to you.  If it doesn't have the object, it retrieves it, keeps a reference, and then returns it to you.  Then next time it'll already have it.
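The repository behaviour described above can be sketched roughly like this. The `IdentityMap` class, its `loader` argument and the `load_count` counter are all hypothetical names chosen for the example, and the loader fakes the database read.

```python
class IdentityMap:
    """Repository that hands out one object per key, loading on a miss."""

    def __init__(self, loader):
        self._loader = loader   # callable: key -> freshly loaded object
        self._objects = {}      # key -> object already handed out
        self.load_count = 0     # for illustration only

    def get(self, key):
        if key not in self._objects:
            # Miss: retrieve the object and keep a reference to it.
            self._objects[key] = self._loader(key)
            self.load_count += 1
        # Hit (or just loaded): return the single shared instance.
        return self._objects[key]


def load_department(dept_id):
    # Hypothetical loader standing in for a database read.
    return {"id": dept_id, "name": f"Department {dept_id}"}


departments = IdentityMap(load_department)
first = departments.get(7)
second = departments.get(7)
```

The second call does not touch the loader; it returns the object stored by the first.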

db4o is pretty cool, though.  You should take a look at it.
Kyralessa
Thursday, May 11, 2006
Thanks folks. It sounds like I was groping towards the identity map idea (which makes me feel clever!)

So I'm right in thinking the identity map has to keep *weak* references? That Martin Fowler page just says "an Identity Map keeps a record of all objects", but I think we have to use weak refs to allow the object to be GC'd once all 'real' users have finished with it.
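One way to try the weak-reference variant is Python's standard `weakref.WeakValueDictionary`, as in the sketch below. The `WeakIdentityMap` name and the trivial `Department` class are assumptions for illustration. Whether weak references are actually required depends on how long the map itself lives: a map that is created per unit of work and then discarded whole can get by with ordinary references.

```python
import gc
import weakref


class Department:
    def __init__(self, dept_id):
        self.dept_id = dept_id


class WeakIdentityMap:
    def __init__(self, loader):
        self._loader = loader
        # Values are held weakly: an entry disappears once nothing
        # outside the map references the object any more.
        self._objects = weakref.WeakValueDictionary()

    def get(self, key):
        obj = self._objects.get(key)
        if obj is None:
            obj = self._loader(key)
            self._objects[key] = obj
        return obj

    def __contains__(self, key):
        return key in self._objects


departments = WeakIdentityMap(Department)
x = departments.get(1)
same = departments.get(1) is x   # True while x keeps the object alive
del x                            # drop the last 'real' reference
gc.collect()                     # force a collection for the demonstration
still_cached = 1 in departments
```

Once the last outside reference is gone, the entry drops out of the map and a later `get(1)` would load a fresh object.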

DJ: It's not about saving memory; it's about making sure there aren't two different objects representing the same logical entity.

Benji and Kyralessa: I will put db4o on my interesting things queue, thanks.
Larry Lard
Friday, May 12, 2006
I wouldn't even worry about the extra reference... with a scripting environment (PHP for example) objects are being created / destroyed in every request. It's a non-issue on a modern machine.
Joe
Wednesday, May 17, 2006
On a postmodern machine, though, watch out!!

Those things are completely unpredictable.
BenjiSmith
Wednesday, May 24, 2006

This topic is archived. No further replies will be accepted.
