The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

Should objects save themselves or be saved by others?

I just finished a code review with a friend of mine and found out that we both wrote functionally very similar code, but it was very different when it came to saving data.  So now we have two very different frameworks for saving data, and we're trying to decide which one to keep going forward.

I won't say which, but one of us designed our classes so that the DataStorage object saved each type of object.

    DataStore.SaveObjectOne( Object );
    DataStore.SaveObjectTwo( Object );

And the other person wrote things so objects wrote themselves to a provided data store...

    ObjectOne.Save( DataStore );
    ObjectTwo.Save( DataStore );


Both save methods use SQL to save data.  We haven't started using hibernate or similar packages yet.  We may consider using ORM in the future.

So I have two questions for all of you...

1 - Which method do you use or prefer? 
2 - Which method will adapt more easily to ORM?
Russ Ryba
Friday, June 22, 2007
 
 
I haven't taken the ORM jump yet either, but I prefer a combination of the two methods. Instead of passing the DataStore into the Save method, just pass it into the constructor (or use an Inversion of Control container to pass it to the object). This enforces encapsulation while keeping the class decoupled from its data source. It's not very often that an object's data source changes during execution, so there is rarely a need to pass the data source on each Save.

public class MyClass
{
  private ClassRepository _repository;

  public MyClass(ClassRepository repository)
  {
    _repository = repository;
  }

  public void Save()
  {
    _repository.Save(this);
  }
}

public class ClassRepository
{
  public void Save(MyClass instance)
  {
    //..save it
  }
}

...

MyClass ObjectOne = new MyClass(new ClassRepository());
MyClass ObjectTwo = new MyClass(new ClassRepository());

ObjectOne.Save();
ObjectTwo.Save();
ScottC
Saturday, June 23, 2007
 
 
I used the "object saves itself" variant in a CS project. I liked it because it was a fun trick and it was possible to create objects that saved themselves from a constructor so you could have code like:

class myObject {
 ...
}

myObject x = new myObject();  // x instantiates & saves self to datastore

I think a few people just threw up though! I changed it to something a bit more conventional but it was fun while it lasted.
Cymen
Saturday, June 23, 2007
 
 
I think both options are valid.

One aspect to consider is code maintainability. If the DataStore class is more likely to change than the classes you need to save, then I would consider the 1st option, since you would only need to make changes in one place. However, if the classes you need to save are more likely to change than the DataStore class, I would contemplate the 2nd option, following the same reasoning.
Toby
Saturday, June 23, 2007
 
 
This question usually turns into a religious argument.  Do whatever works best for your application's architecture, within whatever shop-driven constraints you have.
Eric Knipp
Saturday, June 23, 2007
 
 
I prefer to be saved by another object to keep all the SQL ORM stuff out of my "real" object. It's just cleaner that way.
son of parnas
Saturday, June 23, 2007
 
 
If you are writing application code, then the object should probably save itself. If you are writing framework/API code, then the object should be saved by the framework.

In application code the range of contexts in which the object and the data stores must interact is very limited, so it doesn't hurt anything to tightly couple the two. What's more, coupling is, almost always, a shortcut, and you will get the application done sooner, with less effort, if you allow close coupling.

In a framework or API, however, you have little or no idea what contexts will be at work between the objects (both framework/API objects and client objects) and the data stores (yes, plural) so you can't afford to couple the two very closely: you need the framework/API to be flexible for both platform changes and client needs.

Writing applications is merely hard. Writing good frameworks/APIs is excruciatingly difficult.
Jeffrey Dutky
Saturday, June 23, 2007
 
 
As others said, either way is correct. It really is a matter of how you structure the application. If your objects travel all the way from the data access layer to the UI, then it probably makes sense to separate the database code from the object.

I use an ORM tool that gives me the two options. They call them Self-service and Adapter. If you are interested, download their docs:

http://www.llblgen.com/pages/documentation.aspx
JSD
Saturday, June 23, 2007
 
 
The object should provide a data buffer or XML string representation of itself; the data store should save it.
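
A minimal Java sketch of this split, under stated assumptions: the names `Exportable`, `BlobStore` and `Customer` are hypothetical and do not come from the thread, and an in-memory map stands in for the real data store. The object only knows how to describe itself as XML; the store only knows how to persist a string under a key.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical contract: the object only describes itself as a string.
interface Exportable {
    String storageKey();
    String toXml();
}

// Hypothetical store: persists opaque strings and knows nothing about Customer.
// A HashMap stands in for the real database or file system.
final class BlobStore {
    private final Map<String, String> rows = new HashMap<>();

    void save(Exportable item) {
        rows.put(item.storageKey(), item.toXml());
    }

    String read(String key) {
        return rows.get(key);
    }
}

final class Customer implements Exportable {
    private final int id;
    private final String name;

    Customer(int id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public String storageKey() {
        return "customer/" + id;
    }

    @Override
    public String toXml() {
        // No escaping is done here; real code would escape special characters.
        return "<customer id=\"" + id + "\"><name>" + name + "</name></customer>";
    }
}

final class ExportDemo {
    public static void main(String[] args) {
        BlobStore store = new BlobStore();
        store.save(new Customer(7, "Ada"));
        System.out.println(store.read("customer/7"));
    }
}
```

The trade-off is that the store can be swapped without touching `Customer`, but the store can no longer query on individual fields unless it parses the string.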
onanon
Saturday, June 23, 2007
 
 
Objects should save themselves.

However, that doesn't mean that you need to put, say, SQL code into your class definitions. You can still have separate SQL query builder objects and helper tools that handle all the aspects of serializing the data into useful forms that get called by Object.Save().
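
One way this could look in Java, as a sketch only: `InsertBuilder`, `SqlExecutor` and `Invoice` are hypothetical names invented for illustration, and the executor is left as an interface so no particular database API is assumed. The object still saves itself, but the SQL text is produced by the helper.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds a parameterized INSERT so entity classes hold no SQL text.
final class InsertBuilder {
    private final String table;
    private final Map<String, Object> values = new LinkedHashMap<>();

    InsertBuilder(String table) {
        this.table = table;
    }

    InsertBuilder set(String column, Object value) {
        values.put(column, value);
        return this;
    }

    String sql() {
        StringJoiner columns = new StringJoiner(", ");
        StringJoiner marks = new StringJoiner(", ");
        for (String column : values.keySet()) {
            columns.add(column);
            marks.add("?");
        }
        return "INSERT INTO " + table + " (" + columns + ") VALUES (" + marks + ")";
    }

    List<Object> parameters() {
        return new ArrayList<>(values.values());
    }
}

// Hypothetical seam for whatever actually runs the statement (JDBC, a test double, ...).
interface SqlExecutor {
    void execute(String sql, List<Object> parameters);
}

final class Invoice {
    private final int number;
    private final String customer;

    Invoice(int number, String customer) {
        this.number = number;
        this.customer = customer;
    }

    // The object saves itself, but delegates the SQL details to the builder.
    void save(SqlExecutor executor) {
        InsertBuilder insert = new InsertBuilder("invoice")
                .set("number", number)
                .set("customer", customer);
        executor.execute(insert.sql(), insert.parameters());
    }
}

final class BuilderDemo {
    public static void main(String[] args) {
        new Invoice(42, "Acme").save((sql, params) -> System.out.println(sql + " " + params));
    }
}
```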
Robby Slaughter
Saturday, June 23, 2007
 
 
+1 to Robby.

Well I don't have a particular opinion one way or the other because different circumstances require different solutions.  Having objects save themselves is an easier API to use but less flexible with regards to supporting different storage types.

I know all my objects will always be stored to a SQL database so I just have them save themselves.  However, very little SQL logic is actually stored in the class -- the classes get much of that functionality from the inheritance hierarchy.  If I needed to swap out my storage for something else (like XML) it would be easy to do.  But I couldn't easily support two storage types at a time.
Almost H. Anonymous
Saturday, June 23, 2007
 
 
It depends on your architecture, degree of object coupling, shop practices, developer skill level, and level of complexity needed.  There's no one "right" answer.

On one end of the spectrum is a CRUD style app with few inter-object dependencies.  IMO, this is a good case to be using the "object saves itself" style.

On the other end of the spectrum is a distributed system with lots of inter-object dependencies.  IMO, this is a good case for having a data storage object.
xampl
Sunday, June 24, 2007
 
 
There's no one right answer, of course.

The "store saves the object" style is referred to these days as "persistence ignorance". There are some good discussions of the topic in Evans's "Domain-Driven Design" book [0].

Personally, I prefer to not have save methods on my objects - this comes back to the single responsibility principle. I've got enough to worry about getting the object to do its own job without adding boilerplate persistence code on top.

-Chris

[0] http://www.amazon.com/Domain-Driven-Design-Tackling-Complexity-Software/dp/0321125215
Chris Tavares
Sunday, June 24, 2007
 
 
You're missing some very important data in the question. How many types of DataStore and objects are there?

Maybe you only have one type of object that is capable of saving itself to FTP, MS SQL Server, Oracle, WebService or multiple different file types (binary, CSV, XML).

Alternatively you might have many object types that all save to the same DataStore.

I'm not offering a suggestion, just pointing out that in the real world there are many more parameters that affect the best decision.
Adrian
Monday, June 25, 2007
 
 
As many have already said, it depends on the context. To be a little more precise, I think the main question is: how likely is it that you will change your data storage? In other words, if you foresee important changes in the data storage, then it would be better to delegate the save method to an external object.

Even in that case, I would make objects "Savable", that is, require them to expose their internal state to be saved, because this is going to be tricky for any non-trivial object (consider composite objects). Another non-trivial problem is handling references to other objects that could be saved/restored independently.

Finally, consider also that if you are saving objects, then you need to restore them from the data storage at some point. This will affect the way you construct them, and you will need some factories to build them (for composite objects, you will need a sort of Builder).

I once used the idea someone suggested of exposing the object's state as XML, and it worked well (with a small performance penalty): we were able to change the data storage design COMPLETELY before finishing the project with almost no impact on the business objects (we migrated from an XML DB to an RDBMS due to poor performance).
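
A rough Java sketch of the "Savable plus factory" idea, with assumptions stated up front: the names `Savable`, `Account` and `fromState` are hypothetical, and a plain key/value map is used in place of XML to keep the example short. The object exposes its state for whoever stores it, and a factory method rebuilds it on load.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical contract: an object exposes its state without knowing who stores it.
interface Savable {
    Map<String, String> exportState();
}

final class Account implements Savable {
    private final String owner;
    private final long balanceCents;

    Account(String owner, long balanceCents) {
        this.owner = owner;
        this.balanceCents = balanceCents;
    }

    @Override
    public Map<String, String> exportState() {
        // A map stands in for the XML representation described above.
        Map<String, String> state = new LinkedHashMap<>();
        state.put("owner", owner);
        state.put("balanceCents", Long.toString(balanceCents));
        return state;
    }

    // Factory used when restoring, so loading does not need ad hoc setters.
    static Account fromState(Map<String, String> state) {
        return new Account(state.get("owner"), Long.parseLong(state.get("balanceCents")));
    }

    String owner() {
        return owner;
    }

    long balanceCents() {
        return balanceCents;
    }
}

final class SavableDemo {
    public static void main(String[] args) {
        Map<String, String> state = new Account("Pablo", 1250).exportState();
        Account restored = Account.fromState(state);
        System.out.println(restored.owner() + " " + restored.balanceCents());
    }
}
```

A composite object would export nested state and need a builder that restores its parts in the right order; that part is not shown here.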


Pablo
Pablo Chacin
Monday, June 25, 2007
 
 
> Personally, I prefer to not have save methods on my objects - this comes back to the single responsibility principle.

Good point. For the people wanting the object to save itself, does the object also create itself? If not, does it make sense to have it save itself?
son of parnas
Monday, June 25, 2007
 
 
My objects have both save() and load() methods.  You create the object, and then load its state (or not, if it's completely new).
Almost H. Anonymous
Monday, June 25, 2007
 
 
Thanks for the excellent replies everyone.  This mirrors a lot of the conversations we're already having. 

Sorry for asking the question without the full context of our goals.  The main thing we're trying to do is use the same objects on both the desktop and CE devices.  So in this case the same object has to know about at least two storage methods, or each storage method has to know about the objects.

We ideally would like to auto-generate the build for both desktop and handheld from the same source code base for the logic portion of the application.

We're planning on using SQL Server on the desktop and SQLCE on the handheld.  The features and syntax of the two databases are just different enough that we can't port the code directly from one to the other.

On the one hand, an object saving itself is good because it has access to all of its internal / private data.  It knows what to save and so on.  The part that seems a little uncomfortable is coupling the object code to a particular database.  I've worked on larger projects before where we tried to port to Oracle but found it incredibly hard because the system used too many T-SQL specific statements scattered through the codebase.

On the other hand, the database saveObject methods will often need to get at the private members to save things correctly.  Then the data storage controller can save the object in whatever way it sees fit.

Right now I'm leaning towards data storage classes with saveObject methods and either read-only or perhaps inherited classes that expose some of the internal values of the objects that we need to save and load.
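
A sketch in Java of what that leaning might look like, under loud assumptions: `OrderView`, `Order`, `OrderStore` and the two store classes are hypothetical names, the SQL strings are placeholders rather than real desktop or SQLCE statements, and each store just logs what it would run. The business object is shared; only the store differs per platform.

```java
import java.util.ArrayList;
import java.util.List;

// Read-only view of what a store needs; the business object decides what to expose.
interface OrderView {
    int id();
    long totalCents();
}

final class Order implements OrderView {
    private final int id;
    private long totalCents;

    Order(int id) {
        this.id = id;
    }

    // Business logic stays in the shared class.
    void addLine(long priceCents) {
        totalCents += priceCents;
    }

    @Override public int id() { return id; }
    @Override public long totalCents() { return totalCents; }
}

// One interface, one implementation per platform.
interface OrderStore {
    void saveOrder(OrderView order);
}

// Assumption for illustration: the desktop build calls a stored procedure named SaveOrder.
final class DesktopOrderStore implements OrderStore {
    final List<String> log = new ArrayList<>();

    @Override public void saveOrder(OrderView order) {
        log.add("EXEC SaveOrder ?, ? [" + order.id() + ", " + order.totalCents() + "]");
    }
}

// Assumption for illustration: the handheld build issues a plain INSERT.
final class HandheldOrderStore implements OrderStore {
    final List<String> log = new ArrayList<>();

    @Override public void saveOrder(OrderView order) {
        log.add("INSERT INTO Orders (Id, TotalCents) VALUES (?, ?) ["
                + order.id() + ", " + order.totalCents() + "]");
    }
}

final class StoreDemo {
    public static void main(String[] args) {
        Order order = new Order(1);
        order.addLine(100);
        order.addLine(200);
        DesktopOrderStore desktop = new DesktopOrderStore();
        HandheldOrderStore handheld = new HandheldOrderStore();
        OrderStore[] stores = { desktop, handheld };
        for (OrderStore store : stores) {
            store.saveOrder(order);
        }
        System.out.println(desktop.log);
        System.out.println(handheld.log);
    }
}
```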

Thanks to everyone for your comments.  Now that you know my goals does this change any of your suggestions?
Russ Ryba
Monday, June 25, 2007
 
 
"The part that seems a little uncomfortable is coupling the object code to a particular database."

Then don't.  You can still have the objects save themselves (to get access to internal state) and abstract the database particulars; that's what I do.  I could switch to any relational database platform and not have to edit the entity class code.
Almost H. Anonymous
Monday, June 25, 2007
 
 
Even if you're never going to change the standard repository (DB), you might decide to add a remote caching layer like memcached. You'd want to do this in order to reduce the DB load by trimming away most reads.

It would seem to me that if you expected objects to save themselves, you could easily create a clash in the API idioms between normal application logic and caches. I guess you could have the object.save() and object.load() delegate first to a cache, but it gets messier over time. I just wouldn't want this approach because the coupling seems like an unnecessary burden.
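
To make the caching point concrete, here is a small Java sketch, with assumptions labeled: `User`, `UserRepository`, `SlowUserRepository` and `CachingUserRepository` are hypothetical names, and in-process maps stand in for both the database and a remote cache such as memcached. The cache is added as a wrapper around the repository, so neither `User` nor its callers change.

```java
import java.util.HashMap;
import java.util.Map;

final class User {
    final int id;
    final String name;

    User(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

interface UserRepository {
    User load(int id);
    void save(User user);
}

// Stand-in for the database-backed repository; counts reads so the effect is visible.
final class SlowUserRepository implements UserRepository {
    private final Map<Integer, User> table = new HashMap<>();
    int reads = 0;

    @Override public User load(int id) {
        reads++;
        return table.get(id);
    }

    @Override public void save(User user) {
        table.put(user.id, user);
    }
}

// Decorator: same interface, with a cache in front of the backing repository.
final class CachingUserRepository implements UserRepository {
    private final UserRepository backing;
    private final Map<Integer, User> cache = new HashMap<>();

    CachingUserRepository(UserRepository backing) {
        this.backing = backing;
    }

    @Override public User load(int id) {
        User cached = cache.get(id);
        if (cached != null) {
            return cached;            // served from cache, no database read
        }
        User loaded = backing.load(id);
        if (loaded != null) {
            cache.put(id, loaded);
        }
        return loaded;
    }

    @Override public void save(User user) {
        backing.save(user);           // write through to the real store
        cache.put(user.id, user);     // keep the cache consistent with the write
    }
}

final class CacheDemo {
    public static void main(String[] args) {
        SlowUserRepository db = new SlowUserRepository();
        db.save(new User(1, "Ben"));
        UserRepository repo = new CachingUserRepository(db);
        repo.load(1);
        repo.load(1);
        System.out.println("database reads: " + db.reads);
    }
}
```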
Benjamin Manes
Tuesday, June 26, 2007
 
 
Oh, btw, one way to avoid exposing internals in the API is reflection. This is how language-supported serialization handles it. In Java, the core APIs have security permission to access internal state while application libraries (by default) do not.
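
A minimal Java sketch of reading private state by reflection, assuming hypothetical classes `Point` and `ReflectiveReader` written for this example. It uses the standard `java.lang.reflect.Field` API; whether `setAccessible(true)` is permitted depends on the security and module configuration of the running JVM.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class with no getters: its state is not part of its API.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

final class ReflectiveReader {
    // Reads the instance fields declared directly on the object's class,
    // private ones included. Inherited fields are not visited.
    static Map<String, Object> snapshot(Object target) {
        Map<String, Object> state = new TreeMap<>();  // sorted by field name
        for (Field field : target.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue;
            }
            try {
                field.setAccessible(true);  // may be refused by the JVM's access rules
                state.put(field.getName(), field.get(target));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("cannot read " + field.getName(), e);
            }
        }
        return state;
    }
}

final class ReflectionDemo {
    public static void main(String[] args) {
        System.out.println(ReflectiveReader.snapshot(new Point(3, 4)));
    }
}
```

The cost is that the storage format becomes tied to private field names, so renaming a field silently changes what is saved.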
Benjamin Manes
Tuesday, June 26, 2007
 
 
I prefer to write my objects to save themselves.
I still code in Delphi
Tuesday, June 26, 2007
 
 
"I guess you could have the object.save() and object.load() delegate first to a cache, but it gets messier over time."

Object.save() and object.load() can delegate to an abstract API which can either hit the database, or the cache, or serialize to XML, etc.  And you can change it down the road.
Almost H. Anonymous
Tuesday, June 26, 2007
 
 
After learning more about the context, I'm still right. :-) Objects should save themselves.

Your complaint that the SQL is different enough to cause problems in your codebase doesn't really affect the basic model. You can still abstract away the data layer into a library or another object chain (or even something crazy like IFDEFs or your own last-minute linking/transcoding).

Objects are supposed to be intrinsic, independent instantiations. They should be able to handle any operation that uses only their own data.
Robby Slaughter
Tuesday, June 26, 2007
 
 
I'm currently working on a project which was started (by someone else) with the object-saves-itself approach, and with saving tightly tied to the construction of objects. I find this to be a major PITA when it comes to unit testing. Creating an object requires a DB connection, since the damn things want to store themselves immediately. Creating mock objects (Mock subclasses) for testing purposes becomes quite difficult. The objects are always closely tied to a database connection or instance. Creating temporary objects (in comparators, etc.) is also a hassle, since everything immediately wants to save itself to some JDBC storage.

Also, as others have already written, if we replace the database layer (e.g. in order to introduce a factory for test objects in order to run reproducible tests), we need to touch many classes.

So, I'm not a big fan of object-saves-itself. In my mind, an object should be able to just *be*, regardless of where it comes from. It should do its job, and perhaps be able to represent itself in a standard way to the outside world.

Of course, a lot of the problems I'm describing stem from the original implementation decisions and could be overcome in another variant of the same approach. Somehow other project requirements are always higher-priority than my feeling that this implementation does not "smell" quite right so there's never really time for reworking this.
Daniel
Thursday, June 28, 2007
 
 
"Creating an object requires a DB connection, since the damn things want to store themselves immediately."

Yeah, that's a bit screwed up -- a very limiting implementation.

My objects do take a connection abstraction so operations on them can be mocked during unit testing.  Admittedly, I should actually have more of an abstraction between the object and its storage and after reading this thread I'm thinking of implementing that during the next refactor.
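
A small Java sketch of the kind of connection abstraction meant here, as an illustration only: `StorageGateway`, `Note` and `RecordingGateway` are hypothetical names, not code from any real project. Construction touches no storage, saving is an explicit call, and a test can pass in a double that records writes in memory.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical abstraction over the connection; production code would wrap JDBC here.
interface StorageGateway {
    void write(String table, String key, String value);
}

final class Note {
    private final String id;
    private final String text;

    // Constructing a Note touches no storage, so temporary instances are cheap.
    Note(String id, String text) {
        this.id = id;
        this.text = text;
    }

    // Saving is an explicit call, and the gateway is supplied by the caller.
    void save(StorageGateway gateway) {
        gateway.write("note", id, text);
    }
}

// Test double: records writes in memory instead of opening a database connection.
final class RecordingGateway implements StorageGateway {
    final List<String> writes = new ArrayList<>();

    @Override public void write(String table, String key, String value) {
        writes.add(table + ":" + key + "=" + value);
    }
}

final class GatewayDemo {
    public static void main(String[] args) {
        RecordingGateway gateway = new RecordingGateway();
        Note note = new Note("n1", "call Russ");
        System.out.println(gateway.writes.size());  // nothing is written on construction
        note.save(gateway);
        System.out.println(gateway.writes);
    }
}
```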
Almost H. Anonymous
Thursday, June 28, 2007
 
 
I think the important thing here is that the objects represent logic, not implementation. Saving an object is an implementation feature, not a logical feature. If I have an application that uses a bunch of objects to represent business data and various operations on that data, I should be able to grab those objects and use them in another application without worrying about recoding the implementation of the object.

An example that I ran into was an on-demand web application. All the business objects had been written to save themselves by executing stored procedures specific to their object. Then I was tasked with writing an "offline" client, that was to be installed on customers' machines, without a database. I would have loved to use those objects we had, with all their business logic, but I couldn't. I had to keep copies of the objects and strip their save functions so that it would work with an XML data store. The worst was when a bug was fixed in one object, it had to be fixed in the copies and vice versa.
still trudging
Tuesday, July 10, 2007
 
 

This topic is archived. No further replies will be accepted.
