The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

How get 'chains of records'...

Hi,

We are trying to figure out how to store and retrieve 'chains of records'.

More specifically we have courses (classes) and as students progress from one course to the next the notion of a chain of classes is created (e.g. sth like an evolution path).

Usually each course progresses into another course.
course1->course2

But courses can also merge (e.g. course1 and course2 are followed by course3, e.g. the students of both courses are joined into one new course)
course1 \__ course3
course2 /

We considered having a NextCourseID field in each course, but we would not know how to easily query to get the entire chain from any part of it.
We also considered creating a separate table (e.g. tblCourseChains) but did not know how best to implement the whole idea

We would like to be able to
- easily manage the linking of the courses
- easily retrieve an entire chain of courses knowing any part of it

We are currently using asp and an access db.

Any help would be greatly appreciated.

thanks in advance.
Aristotelis T Send private email
Thursday, July 26, 2007
 
 
I would say this calls for a connection table

Course <-PreReqRecord -> Course

PreReqRecord
CourseId
ReqCourseId

so say Course3 has a pre-req of Course1 and Course2 it would look like

Course3  <-- PreReqRecord
              CourseId=Course3
              ReqCourseId = Course1     

        <--- PreReqRecord
                CourseId= Course3
                ReqCourseId = Course2

Your queries wouldn't be that bad, just get the PreReqRecord where CourseId is the course you want to find and use the ReqCourseId to pull the Courses (bet you can be done in a single SQL statement)

Hope this helps.
EdNoWeb
Thursday, July 26, 2007
 
 
In this particular problem (and it should be in your college database textbook but a good algorithms book might have it) is to work backwards.

Visualize a tree (of N number child nodes) where the root or topmost element is either the senior thesis class or a psuedo-class symbolizing graduation.

If you have a class that is an elective, counts towards graduation but otherwise has no successors, then it's linked directly to that top root element. If you have a chain of classes with prerequisites, then following the parent(s) of each class eventually leads up to the root.

Note that a class can have multiple children (the prerequisites merge into the class) and multiple parents (taking the class makes you eligible for several classes at the next level).

The reason I suggested a database book is because the trick here is storing a node as a record with N number of parents and N number of children. (Think relationship tables.)

Traversing the nodes to identify a complete chain is something that has to be done algorithmically because of all the merges and branchings. I suppose you can do it in SQL, but it would be ugly. I'd recommend you do it in VB or C# as a custom control, at which point you're largely just dealing with linked lists.
TheDavid
Thursday, July 26, 2007
 
 
If you don't mind the extra space and denormalized data, you could store a set of paths for each course representing the "lineage" of the course.  Then, to run the queries you can use string operations (StartsWith, EndsWith, Substring, etc) to pick out the bits you need.

For example you could use something like:

Course1/Course2

and

Course1/Course3
Course2/Course3

to represent the cases you have shown above (where CourseX is really the value of the primary key associated with the course). 

Then, you can use the string functions to compute useful stuff about this arrangement of courses.

Basically what you are doing is enumerating all the possible paths through the graph of courses.  You can build these paths up pretty simply as you add nodes to the network.  When you add a new node as a child of another node, you duplicate all of the parent's paths and tack the new child's ID onto the end of each one.

You can re-arrange or augment this kind of approach to suit various kinds of needs, too.  For example, in some cases it might be convenient to store these paths in two fields like:

ParentPath        Item
Course1/Course2  Course4
Course2/Course3  Course4

to represent two ways to get to Course4.

Don't show this to a DBA, though, or they'll likely give ya a beatin.
Franklin Send private email
Thursday, July 26, 2007
 
 
"Don't show this to a DBA..."

I have to admit that's a fairly clever, if inefficient way of storing those relationships. I'm going to have to remember that as a potential interview question. Something along the lines of...

How would you build a system that tracks which classes a student needs to graduate, as well as the order those classes must appear, electives, and sequences that don't necessarily start with the freshman introductory course? To increase the difficulty, build the system so that it doesn't utilize a database, and allow for parallel and optional paths, both branching and merging, such as A-B-C and A-D-C.

Enumerating all these relationships is easy. Later searching them and modifying them, I think will be fairly difficult.
TheDavid
Monday, July 30, 2007
 
 
Basic kinds of searching are pretty straightforward and can be implemented efficiently with the use of indexes of various sorts.  Modification can be a pain, though. For example, moving a large subgraph from one point to another means updating a large number of "path" records.
Franklin Send private email
Monday, July 30, 2007
 
 
Never use a recursive foreign key - rather introduce a x_structure table with parent, child , etc columns.

More flexible for multiple hierarchies.

Search on adjacency list and Celko's nested sets trick

Tuesday, July 31, 2007
 
 
"...rather introduce a x_structure table with parent, child..."

I'm under the impression that Celko's nested sets trick speeds up the SELECT performance immensely, but is harder to modify because you potentially have to recompute all of the relationships above the insertion point. As such, Celko's solution is ideal for relatively static data.

Having said that, it's a good solution if you plan to create a "new" graduation chart every time you modify the classes offered. Err... every academic year, you would start from scratch and make a new chart instead of updating the previous chart.

Use Celko's solution if you're a university administrator and you're producing one big static chart that you can hang up on the wall.

Use the recursive foreign keys solution if you're a student and you want to dynamically modify the chart and plug in classes and remove classes and generally figure out if you want to go for this degree, that degree or both.
TheDavid
Tuesday, July 31, 2007
 
 
The nested set approach is only applicable to trees because it relies on the fact that the node ranges between distinct subtrees do not overlap.  If you take that approach, you'll need to insert duplicate nodes into the tree to account for the multiple paths through the graph.  Can get very complicated....
Franklin Send private email
Tuesday, July 31, 2007
 
 
Better work out exactly how the prerequisites work.

For example, at Thompson Rivers University:

MATH 322 has prereqs of "MATH 212 and at least one of MATH 220, MATH 307, MATH 308 and MATH 312".

MATH 365 has prereq of MATH 211, MATH 212 plus this: "Note: Students who already have credit for COMP 332 may not take MATH 365 for further credit."

Do you have co-requisites as well?

Then, we have the possibility of the course number being recycled.

Sincerely,

Gene Wirchenko
Gene Wirchenko Send private email
Tuesday, July 31, 2007
 
 

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics
 
Powered by FogBugz