The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

Vietnam of Computer Science

I just finished watching an interesting screencast on object/relational mapping. The author (Ted Neward) calls it the "Vietnam of Computer Science", a bit of an eye-catching title, but the content was really worthwhile.

http://go.techtarget.com/r/1470277/5379010

It is a little long IMO, but totally worthwhile if you follow the ORM debates.

FYI
OneMist8k
Wednesday, May 23, 2007
 
 
Thanks for the link.

Here's a question that's prompted by my not understanding one of the 'problems' he mentioned.

* In OO, I think that every object/instance has an "identity" (e.g. its location in memory).

* In a relational database, a record may have (or usually has) an identity, i.e. a key field with a uniqueness constraint which can identify the record; but some don't (some tables aren't defined with unique keys).

In an ORM system, does the schema define an identity field for every table/type?

* If so, then it would seem that it might be easy to maintain a map between database records and instantiated objects, and so I don't understand his concern that if an SQL SELECT statement (especially one that's written by the user instead of being generated by the ORM) is complex then you don't know which records it returns and can't find any corresponding objects that might have already been loaded/cached (which results in 'duplicated' objects, which is of course problematic).

* If not, then why not?
Christopher Wells
Thursday, May 24, 2007
 
 
Object identity is certainly possible and maps quite well to relational identity. You can easily cache objects by primary key for quick access with a couple of nested dictionaries:

Dictionary<Type, Dictionary<object, object>>

This way if you "Selected" the Product with id 42 the ORM system could look to the cache for the object first and return that if it exists (increasing performance and maintaining identity).
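A minimal Java sketch of that two-level cache, for illustration only. The Product entity, the get method and the loader function (a stand-in for the database round trip) are made-up names, not taken from any real ORM:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class IdentityMapSketch {
    // Hypothetical entity used only for illustration.
    static final class Product {
        final int id;
        String name;
        Product(int id, String name) { this.id = id; this.name = name; }
    }

    // Java analogue of Dictionary<Type, Dictionary<object, object>>:
    // entity type -> (primary key -> the single cached instance).
    private final Map<Class<?>, Map<Object, Object>> cache = new HashMap<>();

    // Returns the cached instance for (type, key); the loader, which stands
    // in for the database call, runs only on a cache miss.
    <T> T get(Class<T> type, Object key, Function<Object, T> loader) {
        Map<Object, Object> byKey = cache.computeIfAbsent(type, t -> new HashMap<>());
        Object cached = byKey.get(key);
        if (cached == null) {
            cached = loader.apply(key);
            byKey.put(key, cached);
        }
        return type.cast(cached);
    }

    public static void main(String[] args) {
        IdentityMapSketch session = new IdentityMapSketch();
        int[] loads = {0};  // counts simulated database hits
        Function<Object, Product> fakeDb = k -> {
            loads[0]++;
            return new Product((Integer) k, "Widget");
        };
        Product a = session.get(Product.class, 42, fakeDb);
        Product b = session.get(Product.class, 42, fakeDb);
        System.out.println(a == b);    // prints true: same instance both times
        System.out.println(loads[0]);  // prints 1: second lookup hit the cache
    }
}
```

The second lookup never touches the loader, which is the performance win, and both callers hold the same reference, which is the identity guarantee.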

The trouble comes (and this is what I believe the article is talking about) when you "Select" all the Products that begin with the letter "A". In the worst case you have 99.9% of those products already in your local cache, but unless you go to the database you cannot tell whether you have them all. This can lead to a chunk of wasted data coming back, which is discarded in favor of the objects you already have (to preserve object identity).

You could cache which queries have already been run, so that if they are issued again you know you already have the results (excluding changes made since the first query), or try to exclude objects already in the cache that match the query from being fetched from the database.
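To make the wasted-fetch case concrete, here is a rough Java sketch of merging query results into a primary-key cache. Row, Product, materialize and discardedRows are hypothetical names invented for this example; a real ORM would do considerably more:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class QueryMergeSketch {
    // Hypothetical entity and raw result-row types, for illustration only.
    static final class Product {
        final int id;
        final String name;
        Product(int id, String name) { this.id = id; this.name = name; }
    }
    static final class Row {
        final int id;
        final String name;
        Row(int id, String name) { this.id = id; this.name = name; }
    }

    final Map<Integer, Product> cache = new HashMap<>();
    int discardedRows = 0;  // rows that were fetched and then thrown away

    // Turns the rows returned by a query into objects. A row whose key is
    // already cached is discarded in favor of the existing object, so that
    // there is never more than one instance per primary key.
    List<Product> materialize(List<Row> rows) {
        List<Product> result = new ArrayList<>();
        for (Row row : rows) {
            Product existing = cache.get(row.id);
            if (existing != null) {
                discardedRows++;
                result.add(existing);
            } else {
                Product fresh = new Product(row.id, row.name);
                cache.put(row.id, fresh);
                result.add(fresh);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        QueryMergeSketch session = new QueryMergeSketch();
        Product apple = new Product(1, "Apple");
        session.cache.put(1, apple);  // loaded by an earlier query

        // Simulated result of "all products starting with A".
        List<Row> rows = Arrays.asList(new Row(1, "Apple"), new Row(2, "Avocado"));
        List<Product> products = session.materialize(rows);

        System.out.println(products.get(0) == apple);  // prints true
        System.out.println(session.discardedRows);     // prints 1
    }
}
```

Identity is preserved, but the row for product 1 still had to travel from the database before it could be discarded.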
Nigel Sampson
Thursday, May 24, 2007
 
 
> and this is what I believe the article is talking about

I thought that the problem he mentioned was a problem of integrity (correctness) resulting from there being more than one instance of the same object, not simply a performance (caching) problem. That's why I was puzzled: I would have guessed that ORM schemas do (must) define a database identity for instances of every type of object.
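A tiny Java illustration of the integrity problem meant here, assuming a hypothetical loader that has no identity map and therefore builds a new object on every call (Product and loadWithoutIdentityMap are invented names):

```java
class DuplicateInstanceSketch {
    // Hypothetical entity used only for illustration.
    static final class Product {
        final int id;
        String name;
        Product(int id, String name) { this.id = id; this.name = name; }
    }

    // Stand-in for a loader without an identity map: every call creates a
    // new object from the same (simulated) database row.
    static Product loadWithoutIdentityMap(int id) {
        return new Product(id, "Widget");
    }

    public static void main(String[] args) {
        Product first = loadWithoutIdentityMap(42);
        Product second = loadWithoutIdentityMap(42);

        first.name = "Gadget";  // change made through one copy only

        System.out.println(first == second);  // prints false: two objects, one record
        System.out.println(second.name);      // prints Widget: the other copy is stale
    }
}
```

Both objects claim to be record 42, yet they now disagree about its name, and nothing in memory says which one should win.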
Christopher Wells
Thursday, May 24, 2007
 
 
@Christopher Wells

The mapping you're suggesting is equivalent to mapping classes onto relational views. But that works only for trivial / flat object models, because inheritance or containment (has-a relationships) makes the class->view mapping impossible.

Without inheritance or containment, an object model is equivalent to a relational model.

This is what Ted is saying: you either reduce your object model to a relational one or you need some complex code to implement the ORM (which is what ORM layers do).

His point is that, in some cases, object databases are an alternative to all that ORM complexity.
Dino
Thursday, May 24, 2007
 
 
As an alternative to both ODBs and ORMs, I use DTOs extensively. I try to make sure that my DTOs map well onto relational tables (no inheritance and no containment). Then I aggregate the DTOs within objects in my object model.


For example:

import java.util.Date;

// Plain data holder that maps directly onto one table row.
// (Gender and PersonDal are assumed to be defined elsewhere.)
class PersonDTO {
  String firstName;
  String lastName;
  Date birthDate;
  Gender gender;
}

// Domain object that wraps a DTO and adds behavior.
class Person {
  private PersonDTO data;

  public Person(PersonDTO data) {
    this.data = data;
  }

  public Person [] getChildren() {
    PersonDTO [] children = PersonDal.findChildren(data.firstName, data.lastName);
    return dtoToObject(children);
  }

  public Person [] getParents() {
    PersonDTO [] parents = PersonDal.findParents(data.firstName, data.lastName);
    return dtoToObject(parents);
  }

  // Wraps each DTO in a Person domain object.
  private Person [] dtoToObject(PersonDTO [] dtos) {
    Person [] result = new Person[dtos.length];
    for(int i = 0; i < dtos.length; i++) {
      result[i] = new Person(dtos[i]);
    }
    return result;
  }
}
Dino
Thursday, May 24, 2007
 
 
@Dino

Do you code this by hand, or do you use a generator that creates it from the schema?
OneMist8k
Thursday, May 24, 2007
 
 
@Christopher Wells

I don't have all your answers but I'll comment on a few:

"* In a relational database, a record may have (or usually has) an identity, ...but some don't (some tables aren't defined with unique keys)."

In relational theory, every record is supposed to have a unique identifying attribute (or set of attributes). It may be a synthetic or a natural key, but it is supposed to have one. That isn't always true in practice, though, depending on the DBMS and the physical data model.

"...it would seem that it might be easy to maintain a map between database records and instantiated objects, and so I don't understand his concern that if an SQL SELECT statement (especially one that's written by the user instead of being generated by the ORM) is complex then you don't know which records it returns and can't find any corresponding objects that might have already been loaded/cached..."

One of the problems is with multiple consumers of the data (concurrency). For example, if a Crystal Reports SQL query gets its data directly from the DBMS, it won't know that the data is already instantiated in an object cache. If some of the attributes have changed on the memory-resident object, then Crystal has stale data. That is a concurrency problem.

Another problem he touches on is the fragility of the object cache. He mentions that such caches are prone to leaks "at the end of the day". There is also the issue of data loss when the server crashes, a problem that RDBMSs solved years ago.
OneMist8k
Thursday, May 24, 2007
 
 
@OneMist8k

By hand. There's not much to automate. For example, the dtoToObject method can be made generic if one uses reflection (invoke the Person(DTO) constructor).

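import java.lang.reflect.Constructor;
import java.lang.reflect.InvocationTargetException;

public class Utils {

  // Builds a domain object by invoking its public one-argument constructor
  // that takes the DTO's concrete class, e.g. Person(PersonDTO).
  // (DataTransferObject is assumed to be a common base type for all DTOs.)
  public static <T> T dtoToObject(DataTransferObject dto, Class<T> clazz)
      throws NoSuchMethodException,
              SecurityException,
              InstantiationException,
              IllegalAccessException,
              IllegalArgumentException,
              InvocationTargetException {
      Constructor<T> cstr = clazz.getConstructor(dto.getClass());
      return cstr.newInstance(dto);
  }
}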

But I prefer to hand-code even the dtoToObject method, because it's faster in execution and type-checked at compile time.

Anyway, the code is quite maintainable and easy to understand. Plus, DTOs can be passed around and even shared between different objects. There are even more benefits: DTOs are easy to send across networks.
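As a rough illustration of the network point, here is a Java sketch that round-trips a DTO through bytes. It assumes Java's built-in serialization as the wire format (the post doesn't say which mechanism is actually used) and a trimmed-down PersonDTO; toBytes and fromBytes are made-up helper names:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class DtoWireSketch {
    // Trimmed-down DTO for illustration: flat fields only, so serializing
    // it needs nothing beyond the Serializable marker interface.
    static final class PersonDTO implements Serializable {
        private static final long serialVersionUID = 1L;
        String firstName;
        String lastName;
    }

    // Serializes the DTO into the bytes that would travel over the wire.
    static byte[] toBytes(PersonDTO dto) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(dto);
        }
        return buffer.toByteArray();
    }

    // Rebuilds a DTO on the receiving side.
    static PersonDTO fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (PersonDTO) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        PersonDTO sent = new PersonDTO();
        sent.firstName = "Ada";
        sent.lastName = "Lovelace";

        PersonDTO received = fromBytes(toBytes(sent));

        System.out.println(received.firstName + " " + received.lastName);  // prints Ada Lovelace
        System.out.println(received == sent);  // prints false: the receiver gets a copy
    }
}
```

The receiver gets a copy rather than the original instance, which is fine for a plain data holder but is exactly the situation the identity discussion above warns about for richer objects.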
Dino
Thursday, May 24, 2007
 
 
Bah, the video in the original link requires Real Player, which has been banned here.
Meghraj Reddy
Friday, May 25, 2007
 
 
Well, when you have a really good hammer-and-nail solution (a relational database, for managing mass quantities of data) it can be very awkward to insist we've got this GREAT screwdriver -- now, how can we make the hammer-and-nail solution (relational database) look much more like a screwdriver-and-screw (object oriented) solution?

I mean, the screwdriver-and-screw solution is newer.  It's more complicated.  It's a "higher conceptual level".  It MUST be better for everything, right?

Well, no, not necessarily.  Sometimes, "different tools for different purposes" really IS an OK thing to do.
AllanL5
Tuesday, May 29, 2007
 
 
"I mean, the screwdriver-and-screw solution is newer.  It's more complicated.  It's a "higher conceptual level".  It MUST be better for everything, right?"

Actually, OO (1960s) pre-dates the relational model (1970).
Hans Anderson
Friday, June 08, 2007
 
 

This topic is archived. No further replies will be accepted.
