The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

Multiple-parent tree-like structures in C?

Can anyone give me links and/or suggestions about tree-like data structures which allow nodes to have multiple parents?

I maintain a product which is written in C and which (among other things) manipulates hierarchical data. Currently, each node can have only one parent.

When I inherited the code, it had one huge unsorted array containing thousands of node-parent pairs. If you wanted to get all the descendants of a given node (which you frequently did), you had to iterate over the whole array looking for entries whose parent matched, then recursively iterate over the whole array AGAIN for every child, looking for its children, and so on. It was HIDEOUS.

I re-implemented it as a tree with links to Parent, FirstChild, and NextSibling (so that all the children of a parent could be represented by a linked list). Now, to get all the descendants of a parent, you just walk the linked list, recursing to FirstChild of each node.

However, now they want nodes to be allowed to have multiple parents, and I'm not sure how to represent that. If the children of a node are a linked list (as at present), that assumes a child node has one unique NextSibling, rather than the same child node appearing in several different sibling lists corresponding to its several parents. If I make the children an array instead (so no links between siblings), it becomes much harder to arbitrarily add and remove them in the middle (they're ordered). And I can't have several copies of the node in question, one in the linked list of siblings for each parent, because we do various things with these objects (getting read/write locks on them, looking them up in hashtables, etc) which require them to be unique and not have lots of copies of them around the place.

Thanks!
Rachael
Monday, April 24, 2006
 
 
They wouldn't really be "parents" - now you no longer have a tree - potentially a node could be self-linked (it could be its own descendant)

I would revisit the requirements.
Cade Roux Send private email
Monday, April 24, 2006
 
 
This is graph theory:

http://en.wikipedia.org/wiki/Graph_theory

Trees are just a specialied form of graphs.

The short answer is that each vertex (or node) needs to track both it's parents and children.  This opens up a whole can of worms when you start walking the graph and process it though.  Usually means maintaining a list of visited nodes and checking for a hit on your traversal routine.
Grant
Monday, April 24, 2006
 
 
Second the "revisit the requirements" comment.  I like your "Parent, Child, Siblings" structure, as it's relatively simple to create arbitrary width tree-nodes with it.

But "Multiple Parents"?  It sounds like overkill, and opens the door for lots of very difficult traversal issues.  What structure are they trying to model?
AllanL5
Monday, April 24, 2006
 
 
A very important question is whether nodes can be their own parent/ancestor.

If this is not the case, then you have a directed a-cyclic graph or DAG for short. Trees are a subset of DAGs. Although more complex than trees, they have many useful properties that can be used in alogrithms.
Karel Thönissen Send private email
Monday, April 24, 2006
 
 
Thanks for the help!

(I'd looked around Wikipedia on data structures, but only under Trees - until Grant's post I hadn't actually twigged that it was now a graph, and I was thinking of it as a tree with complications.)

As for the requirements: This is a hierarchy of categories for classifying things into. Until now, each category had a unique parent; but now I'm trying to allow a category to exist under multiple parents.

It's like a directory tree in a filesystem in many ways. So I'm wondering if it's worth trying to imitate the behaviour of symlinks: a file exists in only one directory, but there are "dummy" files in other directories which point to it and pretend to be it for most practical purposes. I could have a parent category with a linked list of siblings, some of which are genuine, others of which contain no information except a pointer to where that node "really" is. I'm wondering if that would be simpler than bringing in adjacency matrices and graph-theoretic algorithms, which might be overkill when probably 95% of customers are going to continue using the old single-parent tree.
Rachael
Monday, April 24, 2006
 
 
Seperate the tree-heirarchy from the data storage.  Then each item in the tree is just a pointer to an item in the data storage.  Use a reference counter for each item so that you can remove the item from storage when all references from the tree are gone.
Brent Send private email
Monday, April 24, 2006
 
 
It may just be me, but it sounds like someone is inventing a transactional relational database. In C. In 2006.

This makes me sad.

Tuesday, April 25, 2006
 
 
Yeah, a "Hierarchy of categories for classifying things into" probably doesn't need multiple parents.  Applying multiple inheritance at this point sounds like one of those ideas somebody had like "Hey, we allow multiple kids, right?  What if we allow multiple parents, too!  Wouldn't that be cool!"

Meaning it sounds like they're adding a 'feature' that they don't need, that makes the solution space MUCH larger and more complicated than it needs to be, and that they haven't thought through all the implications and unintended consequences.

For instance, I don't think even the Greek's Genus-Species hirearchy allows multiple parents.
AllanL5
Tuesday, April 25, 2006
 
 
Even if a child with a particular name can appear at multiple places on the graph, it doesn't necessarily mean that they're the same node.

For example, take a look at this tree:

media
...books
  ...fiction
      ...comedy
      ...mystery
  ...nonfiction
      ...history
      ...science
...cds
  ...music
  ...audio-books
      ...fiction
        ...comedy
        ...mystery
      ...non-fiction
        ...history
        ...science
...dvds
  ...narrative
      ...comedy
      ...mystery
  ...documentary
      ...history
      ...science

Even though this tree appears to allow a child to exist at multiple locations in the tree (and, therefore, to have multiple parents), I would content that each of those child nodes is defined not simply by its own name, but by the full path from trunk to leaf node. So a node called "media.books.fiction.comedy" is a completely different node than one called "media.dvds.narrative.comedy".
BenjiSmith Send private email
Wednesday, April 26, 2006
 
 
We had a usage pattern something like this - not sure it's relevant, but I'll relate it below:

The categories remained a proper tree.  HOWEVER, another table of items can be categorized in categories in separate branches (and they are automatically categorized all the way up the tree).

My example is a helpdesk system.  Say you have a category path: OS...Windows...Windows 2000, and another category path Hardware...Printers...HP...Laserjet6, and another category path for your software Software...WidgetMeister...Reports...Widget List

If a call which only affects HP LJ6 on the Widget List report on Windows 2000 comes in and needs to be categorized, the user picked the ultime relevant child nodes.  The software then makes entries in a many to many link table which links that call to the Windows 2000, Laserjet6, and Widget List categories and a process (T-SQL trigger in this case) goes through and populates the ancestors (skipping any shared ancestors).

The main reason that the ancestors are always populated is that you can report on all calls related to Windows 2000 without having to do recursive queries on calls categorized only to children to determine the implicit categorization.

Anyway, not sure it's relevant, but we've found it to be an excellent way to work (we categorize calls and bugs using the same tree, so you can even do correlations).
Cade Roux Send private email
Wednesday, April 26, 2006
 
 

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics
 
Powered by FogBugz