The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

Two C questions (linked lists and casting to and from void *)

Ok, I have two quick C questions.

As I see it, there are really two different practical ways to implement liked lists (doubly, in my case). The first is to create an abstract linked list type, then an abstract type for list elements containing a pointer to something and pointers to the next and previous elements. The other way of doing it would be to create an abstract linked list type, then for each type of struct you plan on storing in the linked list, add "next" and "previous" members to it. In other words, skip the element abstract data type and store the next and previous pointers right in the structs themselves.

The first method is more flexible, because you can store a pointer to really anything in an element struct, and use the same routines to manipulate it regardless of what type of thing it encapsulates.

The second method is nice, though, because you can skip the container and get straight to the struct. It's also polymorphic. If a function returns, say, the first element from one of these lists, you can use the return value as multiple objects or as a single object:

 HTTP_Request *request;
 HTTP_Request *requests = getRequests();
 
 request = requests;
 while (request) {
    /* Do something with each "request": */
    ...
    request = getNextRequest(request);
 }

or:

 HTTP_Request *request = getRequests();

 /* Do something with the first "request": */
 ...

But since the structs themselves make the chains in the list, you end up needing to use macros to manipulate it.

My question, as someone who hasn't been doing this too long, to the experienced JoS developers, who have been doing it for long, is which method do you prefer and why? As users, which type of lists would you rather work with? Are both acceptable?

My next questions is about "void *" pointers. It was my understanding that in C, casting to and from "void *" was done automatically, not requiring an explicit cast operation. But after writing some glue code in XS to link my C library to Perl, I noticed that in most, maybe even all of the XS example code, pointers are cast to and from "void *" explicitly. I know this is (sort of?) required in C++, but not in C, right? Should I be doing the same thing? Is just good programming practice, or is it a waste of time in C?

Thanks.
The Spirit of São Paulo
Friday, October 01, 2004
 
 
The old names for the first and second types of linked lists are exogenous (first - pointer to data) and endogenous (second - embedded data). The new names are non-intrusive and intrusive data structures.

I prefer endogenous data structures myself and even half wrote a C library to support them but they're very unfashionable which is why they got renamed intrusive.

Exogenous libraries which don't allow back pointers can be much slower if your data nodes are in more than one list at a time.
Wayne Hayter Send private email
Saturday, October 02, 2004
 
 
Explicitly casting to void is your way of telling the next person who reads the code that you meant to do it that way.
Tom H
Saturday, October 02, 2004
 
 
Explicitly casting to void also allows the C code to be compiled by a C++ compiler.
Christopher Wells Send private email
Saturday, October 02, 2004
 
 
I prefered endogeous because exogenous require two heap allocations (one for the item being pointed to, and another for the list item element) instead of one. However endogenous doesn't allow the items to reside on more than one list at a time, and doesn't work for structs that don't contain the endogenous pointers (endogenous is for structs that are specifically designed to be contained on one list at a time).
Christopher Wells Send private email
Saturday, October 02, 2004
 
 
My half-completed library did actually handle endogenous nodes being in more than one list at a time or even in a list and a has table or a list and a priority queue. It just used to store the offset of each embedded list node in the global information for that list; when the library was manipulating a list it would zoom in to the embedded data structure, do the list operation and when it had finished it would zoom out to the full node.


A lot of exogenous libraries don't allow access to the actual list nodes and prevent you from keeping inverse pointers. When you delete a node from it, you supply the data node and they have to step most of the way through to search for the list node that points to your data node. Whereas if you can supply the list node pointer the operation is constant time (very fast).
Wayne Hayter Send private email
Saturday, October 02, 2004
 
 
In C:


struct ListElement {
 ListElement * previous;
 ListElement * next;
 void * val;
};

typedef ListElement* List;

List next(List this) {
  return this->next;
}

List previous(List this) {
  return this->previous;
}

void * val(List this) {
  return this->val;
}

In C++:

class List {
  public List& next() const = 0;
  public List& previous() const = 0;
}

class DLinkedList : public List {
  List* next;
  List* previous;

  protected List() {
    next = NULL;
    previous = NULL;
  }

  public List& next() const {
    return *next;
  }

  public List& previous const {
    return *previous;
  }
}


class MyDLinkedList : public DLinkedList {
  ...
}

or

template TList<T>: public List {
  T val;
  T& val() const {
    return val;
  }
  T val(const T& right) {
    val = right;
  }
}
Dino
Sunday, October 03, 2004
 
 
1) The context should let us decide if we should use the exogenous or endogenous method.
Under the following situations, we will turn to exogenous structures:
- The implementation of the linked list *should* be concealed from the users. (ADT) For e.g. you write the code and give to other team(s) who might make use of it.
- The List should be applied on *MULTIPE TYPES*. Meaning, you might have been encountered with the requirement of developing list of Requests, Responses, Screens, etc.

The greatest advantage of exogenous structure lies in the second point. (This could be easily obtained using templates in C++ without compromising efficiency).

The disadvantage being, there is a kind of efficiency compromise is done due to pointer redirection for every node. Another disadvantage is the comparison operation *SHOULD* be performed by a function outside the ADT implementation. This is applicable depending on your system. In a system where bitwise comparison would suffice, this is not an issue, but most of the real-time systems are not like that.

On the other hand, the following factors would let us decide in favour of endogenous structures:
- It is possible for us to replicate the code for different types. You have only one or two types for which the list implementation should be done.

Unlike the earlier case, there might not be much of efficiency  compromise. And since the code is replicated, at the time of implementation itself you would be knowing the compartor to use, for searches.

The disadvantage is bloating of the binary size. And if you wish to perform some fix, you would be doing that in all the replicas of the code.

2) Yes. C does not mandate explicit casting to and from void pointers. For e.g. the following piece of code compiles in WorkShop Compilers 5.0 in Solaris.
int main() {
  int i;
  int *p;
  void *v;
  v = &i;
  p = v;
}

If you found out places where explicit casting has been done, it must be some good-willed programmer who wanted to make things explicit. Another advantage is this code can be compiled in C++ without much of effort in terms of casting.

Roy.
Roy
Wednesday, October 13, 2004
 
 

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics
 
Powered by FogBugz