The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

externalize/internalize tree structure

What's the best way to externalize and internalize a tree structure?
Monday, January 23, 2006
Do you mean serialization (aka data marshalling)? If so, it depends on what language you are talking about, and how the tree is stored.

Monday, January 23, 2006
Yes I mean serialization as in I want to load/save the data stored in a tree structure. This is in C++. The tree is nothing special. A bunch of nodes each of which might have leaves or more nodes.

I thought this would be something straight out of a CS text book, but I couldn't find anything specific about serializing the data.
Tuesday, January 24, 2006
You also need to discuss if you want the saved file to be portable (need to worry about endian issues across multiple processors), do you want to save it in binary (more efficient), or some ascii format (xml, custom, etc. which might be human editable).

The method I would take would probably be to implement logic in the nodes that would write XML similar to:
  <data>(write your data value(s) here)</data>
  <left>(call any left node write routine here)</left>
  <right>(call any right node write routine here)</right>

Obviously if this is a tree with more than left and right, change those to appropriate tags. By using XML you avoid having to worry about endian issues if you port your software. Just follow the nodes in the tree, having the root node write() member function call the write() function of each child node, which in turn will call write() on their children nodes, etc.

The read is similar. You probably want to have the read() member function be static, and have it return a pointer to the created root node. It will dynamically allocate nodes as it reads the XML file, creating children nodes as the appropriate tags are encountered, and adding them to the current node. Again have the read() function call itself recursively to parse down through the tree, using the returned pointers to add items to the tree structure.

This isn't normally covered in most C/C++ textbooks that I have seen, but there are a good amount of sites on the web that talk about it and give more concrete examples. For a tree it is pretty easy, it gets a little bit more complex once you start getting into graphs with multiple paths that can be followed.
Dave Send private email
Tuesday, January 24, 2006
You could also use a library like e4Graph ( ) which was made for persistance of graph-like datastructures. It might ofcause be overkill if your tree is very simple, but then then the underlying metakit database ( ) might be just what you need.
Dave Robbins
Tuesday, January 24, 2006
I spent considerable time trying to work with Metakit a few years ago and gave up in total frustration. I wouldn't go anywhere near it.

XML is defintely the way to go. There are various serialization packages out there. Search on "serialization". Enternity is one which might be ok.

Next time put more detail in your question.
Neville Franks Send private email
Tuesday, January 24, 2006
Well we have had extraordinary results using metakit in our app, outperforming our previous implementations using berkeley db and sqlite with a large margin.

Part of this probably comes from our datastructures mapping well to metakits internal representation (using subviews), allowing us to avoid some rather awkward constructions in the other db's. But mostly I will attribute it to metakit just being faster.

But I do agree that you need to be capable of understanding the parts where metakit differs from traditional db's to get the most out of it.
Dave Robbins
Tuesday, January 24, 2006
Alternatively, do you have to save the structure itself?

Just for the sake of argument, consider doing a flat dump of the data itself. It will take equal time to write out the data, and it will take longer to read it in, but you'll have the opportunity of sorting the data as you read it.

This would be useful in cases where you cannot guarantee the integrity of the tree itself, i.e., when third party programs might manipulate the data itself. An example might be a program that manages your bookmarks.

I only bring this up because when I was in school, I had the tendency to overcomplicate saving data structures like this.
Tuesday, January 24, 2006
Check boost::serialization
Wednesday, January 25, 2006

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics
Powered by FogBugz