The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

save file management - backwards compatibility

As my app grows, the "save file" needs to include more and more stuff, so I keep adding fields to it. Putting a version number in the save file is not enough of a solution when the user skips upgrades. If their save file is version 1.1 and they skip version 1.2 of the app and go straight to version 1.3, whatever got added in version 1.2 will be missing.

The only way I can think of is to have the app know about all the versions of the save file and upgrade it sequentially from whatever version it is to the latest version step by step.

How do others deal with this?
rookie_coder
Wednesday, April 30, 2008
 
 
"How do others deal with this?"

Use XML.

But I really don't understand the problem.  The version 1.3 reader should just have code to read 1.1 documents and 1.2 documents.  Assuming you're just adding fields, that isn't a difficult change.

If you radically alter the save format, just include the previous version's reader in your program and add new reader code for the new format.
Almost H. Anonymous
Wednesday, April 30, 2008
 
 
A lot depends on what you're saving.

If you're deserializing data into an object model, you can conceivably have a deserializer that works on 1.0 streams, one that works on 1.1 streams, and so on.  That can work, if you have a solid understanding of serialization, and if all of the new properties you implement have defined defaults if they don't get initialized during deserialization.  In that case, you can read any version of the input stream (though when you serialize, you'll serialize to the current version).

But there's a lot of risk.  What if you've forgotten to define a default for a property, and (since you've forgotten about it) you don't remember to test for it?  Worse, what if you start refactoring code and you start getting rid of "unnecessary" default assignments, because you don't remember why they're there?  Then you have a bug in your program that only occurs if you opened a file from a previous rev.  Bet your users find it before you do.

It's a lot more straightforward to build methods for 1.0 -> 1.1, 1.1 -> 1.2, and so on, and just chain them together when you read your input.
Robert Rossney
Thursday, May 01, 2008
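
A minimal sketch of the chained approach described above, assuming the save data has already been parsed into a plain dict. The field names ("volume", "theme", "recent_files") and their defaults are invented for illustration; only the 1.0 -> 1.1 -> 1.2 -> 1.3 chain comes from the thread.

```python
CURRENT_VERSION = "1.3"


def upgrade_1_0_to_1_1(data):
    data = dict(data)            # copy so the caller's dict is untouched
    data["volume"] = 50          # hypothetical field added in 1.1
    data["version"] = "1.1"
    return data


def upgrade_1_1_to_1_2(data):
    data = dict(data)
    data["theme"] = "light"      # hypothetical field added in 1.2
    data["version"] = "1.2"
    return data


def upgrade_1_2_to_1_3(data):
    data = dict(data)
    data["recent_files"] = []    # hypothetical field added in 1.3
    data["version"] = "1.3"
    return data


# Each entry upgrades exactly one version step.
UPGRADES = {
    "1.0": upgrade_1_0_to_1_1,
    "1.1": upgrade_1_1_to_1_2,
    "1.2": upgrade_1_2_to_1_3,
}


def upgrade_to_current(data):
    """Apply single-step upgrades until the data is at CURRENT_VERSION."""
    while data["version"] != CURRENT_VERSION:
        step = UPGRADES.get(data["version"])
        if step is None:
            raise ValueError(f"unknown save version: {data['version']}")
        data = step(data)
    return data
```

A user who skipped 1.2 is handled by the same loop: a 1.1 file passes through the 1.1 -> 1.2 step and then the 1.2 -> 1.3 step, so each new release only needs one new upgrade function.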
 
 
As I understand it, Ruby on Rails gets you to define "roll up" and "roll down" functions, so you can define the migration path, chain it together, save files as an old version (losing a bit of formatting). But yes, it's a pain.
PeterR
Thursday, May 01, 2008
 
 
Serialization may be the obvious choice for this, but I was thinking of using a SQLite database.
rookie_coder
Thursday, May 01, 2008
 
 
"As I understand it, Ruby on Rails gets you to define 'roll up' and 'roll down' functions, so you can define the migration path, chain it together, save files as an old version (losing a bit of formatting). But yes, it's a pain."

The feature you're talking about (ActiveRecord Migrations) is nothing to do with saving files. It's a feature that lets you evolve or roll-back a database schema.
John Topley
Thursday, May 01, 2008
 
 
> The feature you're talking about (ActiveRecord Migrations)
> is nothing to do with saving files. It's a feature that
> lets you evolve or roll-back a database schema.

That would work nicely with my sqlite database scheme actually, but I am not developing a web application.
cynic
Thursday, May 01, 2008
 
 
My serialization functions do:

version << archive
field1 << archive
field2 << archive

if(version >= 1.1)
  field3 << archive
else
  field3 = some default

//for removing an old field
if(version < 1.3)
  tmp << archive //field no longer exists in 1.3

With this, every later version can always read files from all previous versions.
Doug
Thursday, May 01, 2008
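
The pseudocode above could look roughly like this in Python. This is only a sketch under stated assumptions: every field is a little-endian 32-bit integer, versions are stored as integers (10 for 1.0, 13 for 1.3) to avoid comparing floats, and the removed field sits at the end of the record. None of that layout is specified in the post.

```python
import io
import struct

# Assumption: versions are stored as integers rather than 1.1, 1.3, ...
V1_0, V1_1, V1_2, V1_3 = 10, 11, 12, 13
FIELD3_DEFAULT = 0  # hypothetical default for files older than 1.1


def read_i32(stream):
    raw = stream.read(4)
    if len(raw) != 4:
        raise EOFError("truncated save file")
    return struct.unpack("<i", raw)[0]


def load(stream):
    version = read_i32(stream)
    if not V1_0 <= version <= V1_3:
        raise ValueError(f"unsupported save version: {version}")

    record = {"field1": read_i32(stream), "field2": read_i32(stream)}

    if version >= V1_1:
        record["field3"] = read_i32(stream)
    else:
        record["field3"] = FIELD3_DEFAULT

    # Field removed in 1.3: older files still contain it, so read and discard.
    if version < V1_3:
        read_i32(stream)

    return record


def save(stream, record):
    # Saving always writes the current layout only.
    stream.write(struct.pack("<4i", V1_3, record["field1"],
                             record["field2"], record["field3"]))
```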
 
 
Doug,

Wouldn't you get screwed if the field types changed (for example from int to double)?

I guess you can just keep piling on new fields with the desired types instead of editing or removing the previous ones, but that would cause the file to bloat in the long run probably.
rookie_coder
Thursday, May 01, 2008
 
 
One viable solution is to use XML. Use an attribute to signal the version. For example:
<person version="1.0">
 <name>...</name>
</person>

If version 2.0 expands this structure, change the version attribute to "2.0".

This way your application can remain backward compatible and (possibly) forward compatible (a version 1.0 app reading a 3.0 file format).
Glitch
Thursday, May 01, 2008
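
As a rough illustration of the version attribute idea, here is a sketch using Python's standard xml.etree.ElementTree. The <email> element and the choice to treat a missing version attribute as "1.0" are assumptions made for the example, not part of the post.

```python
import xml.etree.ElementTree as ET


def load_person(xml_text):
    root = ET.fromstring(xml_text)
    # Assumption: files without the attribute are treated as version 1.0.
    version = root.get("version", "1.0")
    return {
        "version": version,
        "name": root.findtext("name", default=""),
        # <email> is a hypothetical element added in 2.0; older files
        # simply do not have it, so it falls back to None.
        "email": root.findtext("email", default=None),
    }
```

Elements the reader does not look up are ignored, which is what gives an older reader a chance of opening a newer file, provided the newer format only adds elements.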
 
 
One other thing is to differentiate "official" file versions from "unofficial."  In products I've worked on, we usually have some amount of futzing with the format and have lots of code of the form mentioned by Doug.  So we might have our internal versions 1.1, 1.2, 1.3, 1.4, etc.  But then at some point, we release the code, and that file format becomes 2.0.  The stuff we used internally during the development cycle isn't client-visible, so we can clean it up.

Our released product handles 1.0 and 2.0 fine, but none of the internal versions.
Michael Gibson
Thursday, May 01, 2008
 
 
"Wouldn't you get screwed if the field types changed (for example from int to double)?"

Allow your code to read any version of the database.  But for writing, if the database is an older version create a new file and repopulate with the data.
Almost H. Anonymous
Thursday, May 01, 2008
 
 
rookie_coder:

version << archive

if(version < 1.1)
  tmpInt << archive //archive contains an int
  field3 = (double)tmpInt
else
  field3 << archive //archive contains a double

Anonymous H is right.  Saves only happen using the current version, which means old fields are never saved out, and the file never becomes bloated.  That means old software can't read stuff saved by newer versions, but maintaining forwards compatibility is VERY expensive and, depending on the changes, impossible.
Doug
Thursday, May 01, 2008
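
The int-to-double case above might be sketched like this. The binary layout is an assumption: a little-endian 32-bit version number (10 for 1.0, 11 for 1.1) followed directly by field3, stored as a 32-bit int before 1.1 and as a 64-bit double from 1.1 on.

```python
import io
import struct

V1_0, V1_1 = 10, 11  # assumption: integer version codes


def load_field3(stream):
    (version,) = struct.unpack("<i", stream.read(4))
    if version < V1_1:
        # Old files stored field3 as an int; convert it on load.
        (as_int,) = struct.unpack("<i", stream.read(4))
        return float(as_int)
    # From 1.1 on, field3 is stored as a double.
    (as_double,) = struct.unpack("<d", stream.read(8))
    return as_double
```

Either way the in-memory value is a float, so the rest of the program never sees the old type.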
 
 
Two people have suggested using XML.  As Wolfgang Pauli famously said, "That isn't right.  That isn't even wrong."

The problem under discussion here has nothing to do with the format of the persisted data.  Any serialization format (or database) can store metainformation about its version.  The idea that XML somehow simplifies this problem because it's easy to add a version-number element is exactly the kind of thing that fuels the XML haters' fire, and as a pretty big XML fan myself I don't like to see that happening.

If you're using a database to store the data, it's not really practical to build a method to upgrade the database from version x to x+n where n > 1, unless you generate the upgrade script programmatically, and that is a hard problem.  It's generally only practical to build a script for x to x + 1, and run the scripts successively when you encounter a database that's more than one rev old.
Robert Rossney
Thursday, May 01, 2008
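
For the database case, one possible sketch of running x to x + 1 scripts in succession uses SQLite's user_version pragma to record which schema revision a file is at. The table and column names here are made up, and a real implementation would also want each step to be applied atomically.

```python
import sqlite3

# MIGRATIONS[i] upgrades the schema from revision i to revision i + 1.
# The schema itself is hypothetical.
MIGRATIONS = [
    "CREATE TABLE settings (key TEXT PRIMARY KEY, value TEXT)",
    "ALTER TABLE settings ADD COLUMN updated_at TEXT",
    "CREATE TABLE recent_files (path TEXT NOT NULL)",
]


def migrate(conn):
    """Run every pending one-step migration and return the final revision."""
    (version,) = conn.execute("PRAGMA user_version").fetchone()
    if version > len(MIGRATIONS):
        raise ValueError("file was written by a newer version of the app")
    for step in range(version, len(MIGRATIONS)):
        conn.execute(MIGRATIONS[step])
        # Record progress after each step so a rerun resumes where it stopped.
        conn.execute(f"PRAGMA user_version = {step + 1}")
    conn.commit()
    return len(MIGRATIONS)
```

A brand-new file starts at user_version 0 and runs every script; a file two revisions old runs only the last two.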
 
 
"The idea that XML somehow simplifies this problem because it's easy to add a version-number element"

Not because it's easy to add a version-number element, but because it's a flexible container format.  When the next version adds new fields, you just add another element.  The entire thing is parsed by a library parser so there is no need to write your own file parser.  It's a good use of XML depending on the job at hand.

The OP did eventually mention that he wants to use SQLite, which is also a fine choice depending on the situation.
Almost H. Anonymous
Thursday, May 01, 2008
 
 
> Saves only happen using the current version,

Ah, I get it now. Assuming that the version number's type never changes and that it is at a fixed location (the very first spot), reading any file format should be possible. (If you change these, though, you are screwed.)

Robert, I agree and therefore I didn't comment on the specific implementation. I think I prefer the database approach for other reasons I didn't talk about here, but XML or any other encoding would probably work just as well.
rookie_coder
Thursday, May 01, 2008
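
The fixed-location version number could be sketched as a small header that never changes between releases. The "SAVE" magic bytes and the 16-bit version field are assumptions for the example; the thread only says the version comes first and keeps its type.

```python
import io
import struct

MAGIC = b"SAVE"                  # hypothetical file signature
HEADER = struct.Struct("<4sH")   # 4 magic bytes + unsigned 16-bit version


def write_header(stream, version):
    stream.write(HEADER.pack(MAGIC, version))


def read_version(stream):
    raw = stream.read(HEADER.size)
    if len(raw) != HEADER.size:
        raise ValueError("file too short to be a save file")
    magic, version = HEADER.unpack(raw)
    if magic != MAGIC:
        raise ValueError("not a save file")
    return version
```

Everything after this header is free to change, because the reader decides how to parse the rest only after it knows the version.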
 
 
Here's usually what I do with versioning.

class Foo {
    // Change this every time you have a new version.
    enum { CLASS_VERSION = 3 };

save(ar) {
    ar << CLASS_VERSION;
    // Change this every time you have a new version.
    save3(ar);
}

save1(ar) {
    ar << blah1;
}

save2(ar) {
    ar << blah1;
    ar << blah2;
}

save3(ar) {
    ar << blah1;
    ar << blah2;
    ar << blah3;
}

load(ar) {
    int version;
    ar >> version;

    switch(version) {
        case 1:
            load1(ar);  break;
        case 2:
            load2(ar);  break;
        case 3:
            load3(ar);  break;
        default:
            throw InvalidVersion();
    }
}

load1(ar) {
    ar >> blah1;
}

load2(ar) {
    ar >> blah1;
    ar >> blah2;
}

load3(ar) {
    ar >> blah1;
    ar >> blah2;
    ar >> blah3;
}

}

Do that for every serializable class and you have guaranteed backward compatibility with every version.  You do want to keep the old code for save1() and save2() so that you can keep track of what was being saved at each version.
been there done that
Friday, May 02, 2008
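
A possible Python rendering of the per-class pattern above, with two liberties that are not in the post: the switch statement becomes a lookup table, and each loader calls the previous one instead of repeating its reads. The 32-bit integer layout and the zero defaults are also assumptions.

```python
import io
import struct

CLASS_VERSION = 3  # bump whenever the saved layout changes


def _read_int(ar):
    return struct.unpack("<i", ar.read(4))[0]


def load1(ar, obj):
    obj["blah1"] = _read_int(ar)


def load2(ar, obj):
    load1(ar, obj)               # version 2 = version 1 fields + blah2
    obj["blah2"] = _read_int(ar)


def load3(ar, obj):
    load2(ar, obj)               # version 3 = version 2 fields + blah3
    obj["blah3"] = _read_int(ar)


LOADERS = {1: load1, 2: load2, 3: load3}


def load(ar):
    version = _read_int(ar)
    loader = LOADERS.get(version)
    if loader is None:
        raise ValueError(f"invalid version: {version}")
    # Fields missing from older files keep these (hypothetical) defaults.
    obj = {"blah1": 0, "blah2": 0, "blah3": 0}
    loader(ar, obj)
    return obj


def save(ar, obj):
    # Only the current layout is ever written.
    ar.write(struct.pack("<4i", CLASS_VERSION,
                         obj["blah1"], obj["blah2"], obj["blah3"]))
```

Chaining the loaders only works while versions purely add fields; once a field is removed or retyped, the affected loader has to spell out its own reads again.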
 
 
> How do others deal with this?

Load the file:

  version = read in the version number from the file

  if (version >= 1.1)
  {
    //-- read in fields for version 1.1
  }

  if (version >= 1.2)
  {
    //-- read in fields for version 1.2
  }

  if (version >= 1.3)
  {
    //-- read in fields for version 1.3
  }

Save the file:
  .. write our current version which is 1.3
  .. write out 1.1 fields
  .. write out 1.2 fields
  .. write out 1.3 fields
Jussi Jumppanen
Monday, May 05, 2008
 
 
Thank you everyone!
rookie_coder
Tuesday, May 06, 2008
 
 

This topic is archived. No further replies will be accepted.
