techInterview Discussion

A part of techInterview.org: answers to technical interview questions.

Your host: Michael Pryor

Nested functions in C

I was asked this question in a startup interview.

Why there are no nested functions in C?

As far as I know,gcc provides nested function support as an extension. I had told the follwing answer,

func1()
{
        int i = 0;
        func2()
        {
                i = 10;
        }
}

If someone calls func2() directly, i.e. without calling func1()
then a nonexistant memory is accessed, so,BANG....

What are the other reasons that nested functions are not there in C???


I think,in most other high level languages like C++ and Java nested functions are supported, not sure though..
Pharaoh Send private email
Monday, December 25, 2006
 
 
The short answer is that the language designers chose not to implement nested functions.

There is no intrinsic reason why C cannot have nested functions. Infact, as you mention, gcc provides nested function support as a C extension.

Also, the answer you gave isn't right because nested functions would be local to the containing function by definition, and could therefore only be called from within the containing function.

The real difficulty is that a nested function would have no easy way to access the local variables of the containing function.

This is because other functions may be called before the nested function, and therefore the stack pointer would be at an indeterminate offset from the local variables of the containing function.
Sajid
Monday, December 25, 2006
 
 
The last paragraph in my reply isn't very clear.

What I meant is:

The containing function may call other functions, which in turn may call the nested function.

In which case, the stack pointer would be at an indeterminate offset from the local variables of the containing function.

A normal function would only be able to call the nested function via a function pointer passed as an argument.

Nested functions within the same containing function would be able to call each other directly.
Sajid
Tuesday, December 26, 2006
 
 
Sajid, thanks for clarification, but I think your last two explainations are contradicting, atleast few lines are...I dont think the answer which gave is wrong, isn't it saying same thing as you are saying.
Pharaoh Send private email
Tuesday, December 26, 2006
 
 
Apologies, but my explanation is not the same as the one you gave.

It might be better if I use your example:

func1()
{
    int i = 0;
    func2()
    {
        i = 10;
    }
}

> If someone calls func2() directly, i.e. without calling >func1() then a nonexistant memory is accessed, so,BANG....

This explanantion isn't correct because it isn't possible to call func2() directly outside of func1().

For example, if we had:

main() {
    func2();
}

Then this would result in an error because func2() is local to func1() by definition (nested functions are local to the containing function).
Sajid
Tuesday, December 26, 2006
 
 
What I was saying is that the problem is caused by functions like func3() and func4():

func1()
{
    int i = 0;
   
    func2()
    {
        i = 10;
    }
    
    func3()
    {
        func(2); 
    }
    
    func3();
    func4(&func2);
}

func4((*function)())
{
    (*function)();
}

It's functions like these which would cause the stack pointer to be at an indeterminate offset from the variable i when func2 is called.

By the way, I may be wrong.

Hopefully, someone else will post an opinion contribute to this discussion.

But we might have to wait a day or two because of the holidays :-)
Sajid
Tuesday, December 26, 2006
 
 
It seems to me that the difficulty Sajid points out would be resolved if the compiler used a static memory location to store a pointer to the outer function's stack frame.  The function would need to preserve the previous value of the pointer on entry and restore it on exit, to handle possible recursive calling.  It would get complicated in the presence of threads, though; perhaps thread-local storage would be needed.

I tried this program as a test, and it works correctly:

#include <stdio.h>

void func4(void (*fptr)(int))
{
  int i;
  for (i = 0; i < 3; i++)
    (*fptr)(i);
}

void func1(void)
{
  int table[3];
  void func2(int index)
  {
    printf("%d\n",table[index]);
  }
  void func3(void)
  {
    int i;
    for (i = 0; i < 3; i++)
      func2(i);
  }
  table[0] = 5;
  table[1] = 6;
  table[2] = 7;
  func3();
  table[0] = 8;
  table[1] = 9;
  table[2] = 10;
  func4(func2);
}

int main(void)
{
  func1();
  return 0;
}

Output:
5
6
7
8
9
10

I inspected the disassembled code for this example to see how the compiler implements this, and it turns out it just uses the ECX register to store a pointer to func1's stack frame.  Not sure how it would in general ensure that ECX gets preserved properly, but perhaps it uses different strategies depending on how complicated the code is.
Erick Jones Send private email
Tuesday, December 26, 2006
 
 
On closer inspection of the disassembled code, I realized that the compiler is using a marvelous trick -- dynamic code generation!

When func1 starts, it allocates some extra space in its stack frame and writes the machine language representation of the following two instructions there at runtime:

  ecx = &table
  goto func2

When we pass the "address" of func2 to func4, the compiler really passes the address of this dynamically created code segment, which "fixes up" the value of ECX.  When we call func2 directly from func1, this entry segment isn't used; instead, func1 just takes care to populate ECX properly each time it calls func2.

I hope that made sense.
Erick Jones Send private email
Tuesday, December 26, 2006
 
 
Here's an alternate means of solving this "indeterminate offset from stack frame" problem: the compiler could add a special implicit parameter to each nested function which contains a pointer to the stack frame of the outer function. Nested functions that call each other would pass along the implicit frame pointer, so that even as the call stack grows arbitrarily, each nested function would have a reference to the original outer function's stack frame and access to its variables. All of this would be invisible to the programmer.

For example, this c code

void outer() {
  int x = 0;
  inner_A();
 
  void inner_A() {
    inner_B(x);
  }

  void inner_B(int y) {
    printf("%d", x + y);
  }
}

would be compiled to this:

void outer() {
  int x = 0;
  inner_A(CUR_STACK_FRAME);
 
  void inner_A(char *IMPLICIT_FRAME) {
    inner_B(IMPLICIT_FRAME, IMPLICIT_FRAME + OFFSET(x));
  }

  void inner_B(char *IMPLICIT_FRAME, int y) {
    printf("%d", IMPLICIT_FRAME + OFFSET_OF(x) + y);
  }
}

where OFFSET_OF and CUR_STACK_FRAME are shorthand for numbers that the compiler can insert.

This idea is similar to implementing methods in object-oriented languages by passing a special "this" pointer as an implicit first parameter. In other words, a.f(x) is compiled to f(a,x). In this case, local variables of the outer function act like instance variables of an object.
CS Grad
Thursday, December 28, 2006
 
 
Yeah, but the problem is that if you pass the address of the local function to some external function, then the external function doesn't "know" that it has to pass an implicit stack pointer parameter (and wouldn't know what value to specify anyway).

In my example, how would func4 know what value to use as this implicit stack pointer parameter, or that it even needed to use one?  For all it knows, the function whose pointer is passed to it is just a normal function that takes a single integer parameter.

What's really cool about the dynamic code generation approach is that if the outer function is recursive, each pointer it creates for its local function will know about the appropriate stack frame.  For example:

#include <stdio.h>

typedef int (*myfptr)(void);

myfptr fptrs[3];

void enum_fptrs(void)
{
  int i;
  for (i = 0; i < 3; i++)
  {
    myfptr fptr = fptrs[i];
    printf("Address: 0x%08x  Return value: %d\n",
          fptr,(*fptr)());
  }
}

void recursive_function_with_local_function(int reclevel)
{
  int local_function(void)
  {
    return reclevel;
  }
  if (reclevel < 3)
  {
    fptrs[reclevel] = &local_function;
    recursive_function_with_local_function(reclevel + 1);
  }
  else
  {
    enum_fptrs();
  }
}

int main(void)
{
  recursive_function_with_local_function(0);
  return 0;
}

Output:
Address: 0xbf84e814  Return value: 0
Address: 0xbf84e7e4  Return value: 1
Address: 0xbf84e7b4  Return value: 2

Note that &local_function evaluated to something different each time, because it really points to the special entry point created to load the appropriate stack frame pointer.
Erick Jones Send private email
Thursday, December 28, 2006
 
 
"Yeah, but the problem is that if you pass the address of the local function to some external function, then the external function doesn't "know" that it has to pass an implicit stack pointer parameter (and wouldn't know what value to specify anyway)."

That's a good point. On the other hand, distributing the address of a local function externally is a bit "dangerous"--what happens if that address is used beyond the lifetime of the original outer function? The stack frame of the outer function would have been destroyed and the results of the inner function, which depend on the outer function's variables, may be meaningless.

Seems safer to me to say that the inner function can only be called within the context of the outer function.

To fix this, it seems like you would need to heap-allocate the outer function's stack frame, as done in functional languages. That seems a very distant from C semantics.
CS Grad
Friday, December 29, 2006
 
 
Indeed, the use of this feature creates possible programming pitfalls -- but no more so than the use of pointers generally.  (Think about the case where you store a pointer to a local variable and try to dereference it after the variable goes out of scope.)

C has always been a language that offers you just enough rope to hang yourself, and I think this is just another example of that.  Anyone who is worried about the pitfalls (or that just wants to write maintainable code) would probably be well-advised to avoid use of this particular feature.

Nonetheless, it is there and it works as described.

Here's a quote from the gcc info page which clarifies how this works (and in particular warns you not to use the function pointer beyond the lifetime of the outer function call):

--- quote from gcc info page ---
 It is possible to call the nested function from outside the scope of its name by storing its address or passing the address to another function:

    hack (int *array, int size)
    {
      void store (int index, int value)
        { array[index] = value; }

      intermediate (store, size);
    }

 Here, the function `intermediate' receives the address of `store' as an argument.  If `intermediate' calls `store', the arguments given to `store' are used to store into `array'.  But this technique works only so long as the containing function (`hack', in this example) does not exit.

 If you try to call the nested function through its address after the containing function has exited, all hell will break loose.  If you try to call it after a containing scope level has exited, and if it refers to some of the variables that are no longer in scope, you may be lucky, but it's not wise to take the risk.  If, however, the nested function does not refer to anything that has gone out of scope, you should be safe.
--- end of quote ---
Erick Jones Send private email
Friday, December 29, 2006
 
 

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics
 
Powered by FogBugz