The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

Compiler design question

Can anyone give me any good suggestions on where to find concrete information on how to handle generics in a compiler?  The type checking, etc. that a compiler must do to generate working code.

Aaron F Stanton Send private email
Saturday, November 06, 2004
What language? Here's an interesting discussion of the differences between several languages, and as is often the case, the pythonic way is easier :^)

I suppose if you're really ambitious you could root through the source code of an actual compiler and see what's going on:
Tom H
Sunday, November 07, 2004
It's more a general question - I can see how to implement classes, interfaces, etc., but generics bug my brain a bit.

Thanks for the pointers!
Aaron F Stanton Send private email
Sunday, November 07, 2004
Well, you would presumably have a system for generating interal names for compiler-generated classes. This would encode the generic class, and the types/whatever used in this instatiation, and anything else you want.

Now each time the user tries to use a generic, work out its internal name and search your table of existing types.

If you find it, great, use that code you already generated. (This is just as if the user used a non-generic class they wrote themselves.)

If you don't find it, generate code and add it to the types list under the internal name you made earlier. (This is just as if the user typed in a non-generic class.)

As for the type checking, presumably in your syntax tree corresponding to the generic definition you'll have placeholders for the names denoting generic bits -- just replace these with whatever parameters the user supplied for this particular instantiation, and compile it.

The overall idea is to use your existing systems for compiling, with the generic bits simply expanding the generic forms into the individual instantiations in as sandboxed a way as possible.

Note -- I never wrote anything even remotely like this... It is all conjecture and armchair programming!
Sunday, November 07, 2004
That's actually a good idea that I hadn't considered.  It should work for resolving at compile time.

I wonder how to do it dynamically - might have to do runtime code generation for that.
Aaron F Stanton Send private email
Monday, November 08, 2004
The hard thing, of course, is to make error messages reasonable when compiling instantiated generic code...
Frederik Slijkerman
Saturday, November 13, 2004
The answer to your question depends on whether or not you are implementing generics like C++ templates or Java 1.5 generics.

The C++ mechanism is to create an instance of the class for each combination of template parameters and then produce distinct machine code for each instantiation.  This approach definitely needs to use an internal name to track if the template has already been instantiated with those type parameters.

Java 1.5 uses bound parameters.  The class accepted for a parameter must implement/extend a specified interface/class (the lowerbound class).  Of course, the default lowerbound class is Object.  Within the parametric class only those methods available on the lowerbound class can be used.  That means that the compiler can use the same byte code representation regardless of how the parameteric class is instantiated.  The compiler then has to inject the necessary casts into the client that uses an instantiation of the parameteric class, but only when the actual parameter type is returned.  There is no need to cast when the actual parametric class is used as a parameter because it must be assignable to the lowerbound class.  You would still need to track the exact type of the instantiation of parametric class as it is known at compile time, so that you know when to inject the casts.

If you want to know more about generic type systems, I would recommend reading _Foundations of Object Oriented Languages_.  It goes into some depth about the implications of read and write operations in a generic type system.
Tuesday, November 16, 2004
I'd suggest reading the book "modern compiler implementation in C" by Andrew Appel (ISBN 0-521-58390-X)

There's a whole chapter, called "Polymorphic Types" that covers the implementation of generics in hard details.
Hope This Helps
Wednesday, November 17, 2004
Aaron F Stanton Send private email
Thursday, November 18, 2004
I also found what appears to be a more recent version (ISBN: 0521607655) dated 2004.  It's now on my Amazon wish list. :)

I have an old copy of the Java version of the book.  I should check and see if it covers generics.
Aaron F Stanton Send private email
Thursday, November 18, 2004

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics
Powered by FogBugz