The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

Coding Style Argument

For most of my programming career, I've written code like this:

void debug(const char *msg)
{
    FILEMANAGER *fmi;
    MYFILE *fp;

    fmi = FILEMANAGER_GetHandle();
    if (fmi == NULL) {
        alert("FILEMANAGER_GetHandle() returned NULL!");
        return;
    }
    fp = FILEMANAGER_OpenFile(fmi,"foo.txt",NEW_APPEND);
    if (fp == NULL) {
        alert("FILEMANAGER_OpenFile ('foo.txt') failed!");
        FILEMANAGER_ReleaseHandle(fmi);
        return;
    }
    MYFILE_Write(fp,msg,strlen(msg));
   
    MYFILE_Close(fp);
    FILEMANAGER_ReleaseHandle(fmi);
}


We only branch/introduce new scope when something out of the ordinary happens.  Otherwise, the common case runs straight down.  The UNIX world has a lot of this.

In the Windows world, I see a lot more of this:

void debug(const char *msg)
{
    FILEMANAGER *fmi;

    fmi = FILEMANAGER_GetHandle();
    if (fmi != NULL) {
        MYFILE *fp = FILEMANAGER_OpenFile(fmi,"foo.txt",NEW_APPEND);
        if (fp != NULL) {
            MYFILE_Write(fp,msg,strlen(msg));
            MYFILE_Close(fp);
        } else {
            alert("FILEMANAGER_OpenFile ('foo.txt') failed!");
        }
        FILEMANAGER_ReleaseHandle(fmi);
    } else {
      alert("FILEMANAGER_GetHandle() failed!");
    }
}


I consider this approach inferior, but I can't really say why in a way that would convince anyone else.

Comments?
Michael B
Monday, June 20, 2005
 
 
The second example is argued to be superior because it has only one return statement.

If I am debugging your code, and would like to print a return result at the bottom of the function near the return, then all cases will hit it.

While you have sprinkled alert() calls appropriately, this does not always work well - like say in a small embedded appliance where extra static text may not be tolerable.

One could argue that your case #1 is more efficient, but only an inspection of the compiler-generated code would be convincing. I have looked at similar code with our ARM embedded compiler and found no difference in efficiency between the two styles.

I am pretty sure that Scott Meyers in his Effective C++ books says to prefer a single return statement per function - but as I keep that book at home and not at work, I cannot say for sure.

Doesn't McConnell also make such a statement?
hoser
Monday, June 20, 2005
 
 
Yes, I like the first approach.  In that one, once you've reached a certain point in the code, you know you've successfully navigated whatever errors could have happened up till then.  The nesting level does not depend on how many steps there are in your file-open process, or whatever.

With the Windows example you gave, you have to 'hold in your head' what was checked, what error could have happened, and how deeply nested you are.  This works OK if there are three conditions or fewer.  With many more than that, the nesting gets deeper and deeper, and it becomes very hard to keep straight which error report is tied to which check.

And purists will probably say, use try-catch blocks, or raise error events, or some other C++/Java-ish work-around.
AllanL5
Monday, June 20, 2005
 
 
The single exit point people have a religion about it.

I find the tortured logic you need to maintain one exit point extremely confusing, as your second example shows. The error-handling branches end up far away from the tests that trigger them, so it takes extra mental processing to work out what each line of code is for.

The argument goes, what if you make a change, won't you forget to close something?

In C++ definitely not. Introducing new scopes makes a lot more sense because of the use of destructors in resource management.

In C, using your style of verifying each step before use, you won't easily introduce a leak when the code changes, because the acquisition and its error handling sit close together. If a lot of code separated acquiring the resource from freeing it, introducing a leak would be easy.

You don't initialize your stack pointers to NULL, which I don't care for.

And I would still prefer your code to do:
 MYFILE *fp = FILEMANAGER_OpenFile(fmi,"foo.txt",NEW_APPEND);

as declaring the variable at the point of use is clearer in my mind than storing up variable declarations at the beginning of a function.
son of parnas
Monday, June 20, 2005
 
 
Though I understand the single exit point argument, I don't usually prefer it.  You can make the argument that you might want to toss a print or something in there, but it's a situation that doesn't actually come up that often in my experience.

I generally prefer my logic to get where it's going as simply as possible then return, and if there's more than one return in the process so be it.  If there's one thing that's clear-cut in C/C++ it's "return" so I personally consider it to be a clearer approach than the logical gymnastics necessary to have a single return at all times.

Monday, June 20, 2005
 
 
In the absence of destructors / try-finally blocks, I would actually argue that the second version is superior, because you don't have to think as hard about resource cleanup.

Consider a function that uses three different Things, each of which needs to be closed. Using the first variant, the function will look vaguely like this:

void FirstStyleUsingThings( )
{
  Thing *thing1 = GetMeAThing();
  if( thing1 == NULL ) return;

  Thing *thing2 = GetMeAThing();
  if( thing2 == NULL )
  {
    thing1->Cleanup();
    return;
  }

  Thing *thing3 = GetMeAThing();
  if( thing3 == NULL )
  {
    thing2->Cleanup();
    thing1->Cleanup();
    return;
  }

  DoStuffWithThings(thing1, thing2, thing3);
  thing3->Cleanup();
  thing2->Cleanup();
  thing1->Cleanup();
}

Consider now the following second style:

void SecondStyleUsingThings()
{
  Thing *thing1 = GetMeAThing();
  if( thing1 != NULL )
  {
    Thing *thing2 = GetMeAThing();
    if( thing2 != NULL )
    {
      Thing *thing3 = GetMeAThing();
      if( thing3 != NULL )
      {
        DoStuffWithThings(thing1, thing2, thing3);
        thing3->Cleanup();
      }
      thing2->Cleanup();
    }
    thing1->Cleanup();
  }
}

I find the latter immensely easier to read, understand and maintain, since cleanup code isn't scattered & repeated throughout the function.

Strangely enough, the first version can be considerably cleaned up with the use of gotos.
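
For instance, a rough sketch of that goto version, using the same made-up Thing API and initializing every pointer to NULL so one cleanup label can cover every exit:

void FirstStyleWithGotos()
{
  Thing *thing1 = NULL;
  Thing *thing2 = NULL;
  Thing *thing3 = NULL;

  thing1 = GetMeAThing();
  if( thing1 == NULL ) goto cleanup;

  thing2 = GetMeAThing();
  if( thing2 == NULL ) goto cleanup;

  thing3 = GetMeAThing();
  if( thing3 == NULL ) goto cleanup;

  DoStuffWithThings(thing1, thing2, thing3);

cleanup:
  // Release in reverse order of acquisition; skip anything never acquired.
  if( thing3 ) thing3->Cleanup();
  if( thing2 ) thing2->Cleanup();
  if( thing1 ) thing1->Cleanup();
}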

Having said that, I think that my preference for the second style (which I saw referred to as "The Wedge" on some web site somewhere) only applies in languages without some form of deterministic cleanup (C++ destructors, the C# using statement, try-finally, etc.). In languages that do have it, I'd much rather just do this:

void TheRightWay()
{
  ThingRef thing1( GetMeAThing() );
  ThingRef thing2( GetMeAThing() );
  ThingRef thing3( GetMeAThing() );
  DoSomethingWithThings( thing1, thing2, thing3 );
}

and let the implementation of ThingRef do the work of figuring out whether to throw an exception, and do the appropriate cleanup.
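
One rough sketch of what such a ThingRef might look like -- the exact behaviour is whatever the codebase needs, this just shows the shape (needs <stdexcept>):

class ThingRef
{
public:
  explicit ThingRef( Thing *t ) : thing_( t )
  {
    // Decide the failure policy in one place; here, throw.
    if( thing_ == NULL )
      throw std::runtime_error( "GetMeAThing() failed" );
  }
  ~ThingRef() { if( thing_ ) thing_->Cleanup(); }
  operator Thing*() const { return thing_; }   // lets it be passed where a Thing* is expected
private:
  Thing *thing_;
  ThingRef( const ThingRef& );                 // non-copyable
  ThingRef& operator=( const ThingRef& );
};

If constructing thing2 or thing3 throws, the destructors of the already-constructed ThingRefs run automatically, so cleanup still happens in reverse order.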
Chris Tavares
Monday, June 20, 2005
 
 
Agreed, Chris, your style is clean.  However.

It does assume that the 'Thing1->Cleanup()' call will be the same, no matter WHAT goes wrong with Thing2 or Thing3.  IF that is a correct assumption, then your code will work.  IF that is NOT a correct assumption, then some kinds of work-around decision making may have to be done regarding how Thing2 and Thing3 failed.

IF the cleanup() call changes based on what worked, then the first style is still cleaner (and easier to verify), even at the price of duplicating xxx->Cleanup() calls.

I could also argue that the usual cleanup() code for things that worked happens in the destructor (that is, when the variable goes out of scope as the routine is left) -- but I don't really like that argument.

I guess I conclude by still preferring the first model.  You know what you have at each point in the code.  'Cleanup()' activities make no assumptions about what worked.
AllanL5
Monday, June 20, 2005
 
 
Allan?  Why on earth do you prefer the first style?

If you were consistent about needing direct personal knowledge as to every last aspect of the code, then the whole "assembler" thing should be making you nervous due to it being such a frighteningly high level of abstraction. The idea of going as far as using an architecture-astronaut ultra-high-level language such as C should be giving you a serious panic attack, and anything higher than that would probably count as a lethal weapon.

Besides, if you're writing spaghetti code where the cleanup() function depends intimately on the setup() function of a different module that was called several lines previously then I'm glad I don't work with you, anyway.

Monday, June 20, 2005
 
 
Wow, an anonymous poster who takes issue with me, asking me WHY I prefer the first one, when my posts above clearly explain WHY I prefer the first one.  Golly.

It localizes error handling.  It is easier to understand.  It is easier to join the error handling and recovery with the condition that detected it.  It is less prone to maintenance mistakes in the future.  It minimizes the nesting of IFs.  And yes, if I OPEN a file handle, somewhere down the road I expect I'll have to CLOSE the file handle.

So, I guess I'm glad I don't work with you, too.  Isn't nice we agree about something?
AllanL5
Monday, June 20, 2005
 
 
If I may...

The single return "zealot" (which I am not) would argue the following (if I may be so bold).

First, that since the function returns void, it does not adequately illustrate the usefulness of the idea.  I think the single-point-of-return camp would code it like this (assuming a POSIX-like errno):

// returns number of bytes written, -1 if error
int foo(const char *msg)
{
  int error = 0;
  int nwrite = -1;

  FILEMANAGER *fmi = NULL;
  if( !error ) {
    fmi = FILEMANAGER_GetHandle();
    if( !fmi ) {
      error = errno;
    }
  }

  MYFILE *fp = NULL;
  if( !error ) {
    fp = FILEMANAGER_OpenFile(fmi,"foo.txt",NEW_APPEND);
    if( !fp ) {
      error = errno;
    }
  }

  if( !error ) {
    nwrite = MYFILE_Write(fp,msg,strlen(msg));
    // Not necessary, but useful if the logic is to
    // be extended. A good compiler will notice that
    // error is no longer used and remove this check
    // and error assignment.
    if( nwrite == -1 ) {
      error = errno;
    }
  }

  if( fp ) {
    MYFILE_Close(fp);
  }

  if( fmi ) {
    FILEMANAGER_ReleaseHandle(fmi);
  }

  return nwrite;
}


1. This avoids "the wedge" and huge indentation.
2. Any time an error can occur, it is easy to spot. And it is easy to spot when the author may have missed an error condition.
3. If additional code must be added or removed, it is easy to do so and maintain the error handling logic.

I find this method useful when there are multiple resource failures that may occur.

Personally, I use whatever style fits the bill. If the function is getting difficult to understand then it likely is not performing "one thing" and should be split into simpler pieces.
hoser Send private email
Monday, June 20, 2005
 
 
I'm always puzzled when I see someone assert this "single exit point" thing.

As I understand it, the idea isn't that you're not allowed to have a Return statement except at one point in the function.  Rather, the idea is that no matter how many ways out of the function there are, they should all be going back to the same place.  That is, you shouldn't have GOTOs, or some other way of branching out of a function instead of going back where you started.

The idea is to avoid the old kind of programming full of GOTOs where you're never sure how you got where you are now.  Multiple Return statements don't cause that problem, because the only place you can get back to is the place you came from.
Kyralessa
Monday, June 20, 2005
 
 
I'd like to like your example, hoser, except you never actually return 'error', no matter what works or what goes wrong.  Did you leave something out?
AllanL5
Monday, June 20, 2005
 
 
Ah, but it occurs to me now that none of the examples ever returns an error code.  Hmm.

'hoser's solution uses the 'error' flag to decide when to continue processing.  It certainly is better than the "Wedge".

Personally, I tend to want to return an error code, to give the caller some indication that the call worked.  Multiple 'return' points let you return a different code for each error point -- of course, so does 'return(error)'.

Neat solution.  I'll have to think on it when I'm not so tired.
AllanL5
Monday, June 20, 2005
 
 
And Kyralessa, I completely agree with you.  The problem with multiple return points from a routine is finding out why it exits at each one.  The "First Pattern" the OP posted makes it very easy to determine why each one exited.

And as has been pointed out, you could remove the multiple return points with a 'goto' to the end of the routine -- but you are already achieving that more simply with the multiple return statements.

Personally, I find judicious use of multiple 'return' points for error handling quite effective.  Error handling is always one of those things that prompts 'gotos', or other 'bail out!' kinds of code structures.  The multiple 'return' approach seems the cleanest way, for the reasons stated above.
AllanL5
Monday, June 20, 2005
 
 
No, the reason for not returning error is:

1. The function "debug" is one that writes bytes to a consumer of some kind. Most POSIX functions that perform a write return the number of bytes written or -1 for error. When in POSIX land, do likewise. If this were Win32 land, then returning an HRESULT would have been the norm and the number of bytes written would have been passed back through a pointer to DWORD.

2. Since I wanted to follow convention, returning the error code directly was not the norm. The errno state would be caught by the caller, who checks the number of bytes written and sees that it's -1 (a sketch of that caller-side check follows this list).

3. The local error still performs its task dutifully by keeping the error state consistent.
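
A caller following that convention would check the result in the usual POSIX way, something like this (needs <errno.h>, <stdio.h>, <string.h>):

  int n = foo("some debug text");
  if( n == -1 ) {
    fprintf(stderr, "debug write failed: %s\n", strerror(errno));
  }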
hoser
Monday, June 20, 2005
 
 
I'm not promoting this as any one true way.  That's someone else's business that I care not to get involved in.

I'm merely countering the OP's suggestion that this is not an acceptable way; he is essentially looking for a fig leaf to hide behind in order to declare it unacceptable.
hoser
Monday, June 20, 2005
 
 
How about this, then? It is a variation of the second style that keeps the error handling close to the cause. Instead of "check", one can of course come up with "assert*"-style names, or "isNonNull", etc.:

int check(void *ptr, const char *msg);   /* defined below */

void debug(const char *msg) {
    FILEMANAGER *fmi;
    MYFILE *fp;

    fmi = FILEMANAGER_GetHandle();
    if (check(fmi, "FILEMANAGER_GetHandle() returned NULL!")) {
        fp = FILEMANAGER_OpenFile(fmi,"foo.txt",NEW_APPEND);
        if (check(fp,"FILEMANAGER_OpenFile ('foo.txt') failed!")) {
            MYFILE_Write(fp,msg,strlen(msg));
 
            MYFILE_Close(fp);
        }
        FILEMANAGER_ReleaseHandle(fmi);
    }
}

int check(void *ptr, const char* msg) {
    if (ptr == NULL) {
        alert(msg);
        return 0;
    }
    return !0;
}
icing
Tuesday, June 21, 2005
 
 
How 'bout them exceptions?
slava
Tuesday, June 21, 2005
 
 
If I recall, McConnell is not dogmatic about the "return once" approach.

My observations:

1) For short functions, the returns "in the middle" are not that confusing.

2) The "classic" way of returning once at the bottom often leads to excessive indentation and confusing code. This code is much harder to refactor.

3) Oddly, this is one place where "gotos" work really well. When an error occurs, set an error state and jump to the end of the function to perform cleanup. This avoids the annoying excessive indentation and is easy to follow and maintain. McConnell approves of this approach. One thing that makes this work better is to track an uninitialized/initialized state for each object: for example, set the file pointers to NULL and close them only when they are not NULL (sketched below).
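
Applied to the OP's example, that might look roughly like this (same made-up FILEMANAGER API as in the original post):

void debug(const char *msg)
{
    FILEMANAGER *fmi = NULL;
    MYFILE *fp = NULL;

    fmi = FILEMANAGER_GetHandle();
    if (fmi == NULL) {
        alert("FILEMANAGER_GetHandle() returned NULL!");
        goto cleanup;
    }

    fp = FILEMANAGER_OpenFile(fmi,"foo.txt",NEW_APPEND);
    if (fp == NULL) {
        alert("FILEMANAGER_OpenFile ('foo.txt') failed!");
        goto cleanup;
    }

    MYFILE_Write(fp,msg,strlen(msg));

cleanup:
    /* Everything was initialized to NULL up front, so one label handles all exits. */
    if (fp != NULL)
        MYFILE_Close(fp);
    if (fmi != NULL)
        FILEMANAGER_ReleaseHandle(fmi);
}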
somebody else
Tuesday, June 21, 2005
 
 
------------------------8<---------------
Using the first variant, the function will look vaguely like this:

void FirstStyleUsingThings( )
{
  Thing *thing1 = GetMeAThing();
  if( thing1 == NULL ) return;

  Thing *thing2 = GetMeAThing();
  if( thing2 == NULL )
  {
    thing1->Cleanup();
    return;
  }

  Thing *thing3 = GetMeAThing();
  if( thing3 == NULL )
  {
    thing2->Cleanup();
    thing1->Cleanup();
    return;
  }

  DoStuffWithThings(thing1, thing2, thing3);
  thing3->Cleanup();
  thing2->Cleanup();
  thing1->Cleanup();
}

Consider now the following second style:

void SecondStyleUsingThings()
{
  Thing *thing1 = GetMeAThing();
  if( thing1 != NULL )
  {
    Thing *thing2 = GetMeAThing();
    if( thing2 != NULL )
    {
      Thing *thing3 = GetMeAThing();
      if( thing3 != NULL )
      {
        DoStuffWithThings(thing1, thing2, thing3);
        thing3->Cleanup();
      }
      thing2->Cleanup();
    }
    thing1->Cleanup();
  }
}

------------------------8<---------------

Why not this way, which has no returns and indents to one level only:

void ThirdStyleUsingThings()
{
  Thing *thing1 = GetMeAThing();
  Thing *thing2 = GetMeAThing();
  Thing *thing3 = GetMeAThing();

  if (thing1 && thing2 && thing3)
  {
    DoStuffWithThings(thing1, thing2, thing3);
  }

  if (thing1) thing1->Cleanup();
  if (thing2) thing2->Cleanup();
  if (thing3) thing3->Cleanup();
}
William Rayer
Tuesday, June 21, 2005
 
 
Because, William, in the example creating Thing2 required a successfully created Thing1, and creating Thing3 required both a good Thing1 and a good Thing2.

It's the coupling between these three, the possible error combinations, and how you handle them that makes for either an easy-to-follow solution or a 'wedge' of ever-deeper indentation.
AllanL5
Tuesday, June 21, 2005
 
 
That style would work for this specific example, but in general, acquiring thing3 often depends on successfully having acquired thing2 & thing1 first, so you just can't blindly get thing3 without checking that thing1 and thing2 are already set up.

Also, you clean up in the same order as acquisition; long years of C++ tell me that you should release in the reverse order of acquisition. ;-)
Chris Tavares
Tuesday, June 21, 2005
 
 
For those who asked :
Code Complete 2, Chapter 17.
"Multiple returns can enhance a routine's readability and maintainability, and they help prevent deep nested logic. They should, nevertheless, be used carefully."
Pakter
Tuesday, June 21, 2005
 
 
>  They should, nevertheless, be used carefully.

Isn't that automatically appended to every suggestion?
son of parnas
Tuesday, June 21, 2005
 
 
Once again, I am becoming acutely aware that exception handling blocks are a true boon to both the readability and the reliability of code. ;) Too bad pure C does not support them, but surely there are some extensions which supply C with exception handling?
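
The nearest thing in standard C is setjmp/longjmp, which the exception-emulation macro packages for C are typically built on. A bare-bones sketch (names made up):

#include <setjmp.h>
#include <stdio.h>

static jmp_buf error_ctx;

static void risky_operation(int fail)
{
    if (fail)
        longjmp(error_ctx, 1);      /* the "throw" */
    puts("operation succeeded");
}

int main(void)
{
    if (setjmp(error_ctx) == 0) {   /* the "try" */
        risky_operation(1);
    } else {                        /* the "catch" */
        puts("recovered from failure");
    }
    return 0;
}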
ping?
Tuesday, June 21, 2005
 
 
Since nobody's brought it up so far, allow me to interject Alexandrescu's ScopeGuard:

http://www.cuj.com/documents/s=8000/cujcexp1812alexandr/

Helps a *ton* with resource cleanup, especially when dealing with legacy APIs and you don't want to write RAII wrapper after RAII wrapper ad nauseam.
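
For anyone who hasn't seen it, the rough idea -- this is a hand-rolled approximation, not Alexandrescu's actual interface -- is an object that runs a cleanup callback when it goes out of scope:

#include <cstdio>

// Not the real ScopeGuard; just the shape of the idea.
template <typename T>
class ScopeCleanup
{
public:
  typedef void (*CleanupFn)( T* );
  ScopeCleanup( CleanupFn fn, T *obj ) : fn_( fn ), obj_( obj ) {}
  ~ScopeCleanup() { if( obj_ ) fn_( obj_ ); }
private:
  CleanupFn fn_;
  T *obj_;
  ScopeCleanup( const ScopeCleanup& );              // non-copyable
  ScopeCleanup& operator=( const ScopeCleanup& );
};

static void close_file( std::FILE *f ) { std::fclose( f ); }

void write_with_guard( const char *comment )
{
  std::FILE *log = std::fopen( "log.txt", "a" );
  if( log == NULL ) return;
  ScopeCleanup<std::FILE> closeLog( &close_file, log );  // closes on every exit path
  std::fprintf( log, "%s\n", comment );
}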

My vote's for the first approach (multiple exit) as well.  The very first file of "professional" code I ever had to work with was some old, old C that nested its logic, I jest not, ten levels deep.  The scar remains.
Mike Bland
Tuesday, June 21, 2005
 
 
On the whole I try for a single exit when it makes sense, not because of having somewhere convenient to stick print statements but because, for that function or method, it reduces the number of paths through the code.

Path reduction and branch reduction simplify the code and reduce the false-positive bugs that can happen when you have over-complicated paths.

That said, when you see code (regardless of how many exits it has) that has considerable complexity wrapped up in a switch or case statement, you know that code has not been factored properly.  Factoring it reduces the complexity at that point.
Simon Lucy
Thursday, June 23, 2005
 
 
Well, Simon, isn't a 'return' in the middle the same as a 'goto' to a return at the end?

In that case, it doesn't really simplify the number of paths through the code.

On the other hand, having 'return' statements in the middle of a routine does mean you cannot take it for granted where the routine ends.  If it otherwise simplifies the paths through the code, I don't think that's necessarily a bad thing.  In any case, to understand or verify the code you'll need to trace every path and condition through it -- allowing returns in the middle doesn't change that.

Note also that these 'lots of decisions, each with side effects' types of routines are also known as 'transaction centers'.  These tend to be few, and long.  I usually allow a waiver in my style guides for these sorts of things, since there tends to be no non-complicated way of doing it.  The 'linear' "try this and return on failure" pattern at least localizes the complexity.

The 'wedge' approach quickly becomes unwieldy, with many hard-to-notice side effects, as the rest of this thread has shown.
AllanL5
Thursday, June 23, 2005
 
 
Keeping in mind that one of the big goals in programming is to manage complexity, I prefer mid-function Returns.  The reason is that each one is a condition I can take for granted in the rest of the function; I think of them as "one less thing to worry about" as I go through the function.

For instance, if my function does something to the SelectedItem of a ListView, then one thing I have to keep track of is whether my ListView actually has an item selected or not.  But if at the very beginning of the function I said to Return if a ListView item isn't selected, then for the rest of the function I don't have to worry about it; I can take it for granted that we have a SelectedItem to work with, and I can focus more on the other aspects of the function.
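
In code, that guard clause amounts to something like this (the ListView types and accessors are made-up names, just for illustration):

typedef struct Item Item;
typedef struct ListView ListView;

/* hypothetical accessors */
Item *ListView_GetSelectedItem( ListView *list );
void  ListView_RemoveItem( ListView *list, Item *item );

void DeleteSelectedItem( ListView *list )
{
    if( list == NULL )
        return;

    /* Guard clause: past this point an item is guaranteed to be selected. */
    Item *selected = ListView_GetSelectedItem( list );
    if( selected == NULL )
        return;

    ListView_RemoveItem( list, selected );
}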
Kyralessa
Thursday, June 23, 2005
 
 
Kyralessa:  Ditto.

To obfuscate what you said slightly, an early return absolutely enforces an invariant necessary for the correct functioning of the rest of the function.  With the "wedge", it's easy to lose track of which invariants are in effect at which level of nesting.

Or, like you said, it's less stuff to remember.  :-P
Mike Bland
Friday, June 24, 2005
 
 
Darn, Mike, that's beautiful.  Can I use it?  If I can come up with an explanation that uses words like "invariant", then my customers don't question it.
AllanL5
Friday, June 24, 2005
 
 
I'd be honored.  :-)

Guess customers who pay the big bucks want some big words in return, even if it's unlikely they understand them...  Must give 'em warm fuzzies that it's money well-spent.  :-P
Mike Bland
Saturday, June 25, 2005
 
 
> In that case, it doesn't really simplify the
> number of paths through the code.

Doesn't it though?

If you jump to the end of the routine then you can localize all the cleanup at the end, assuming you have an easy test, like a comparison to NULL, to determine whether a resource was initialized.

When you don't have destructor-based resource reclamation, I think this is a better approach than the single return or trying to clean up on each early return.
son of parnas
Saturday, June 25, 2005
 
 
Hmm.  What I meant was a single exit point does not reduce the number of paths through the code -- a McCabe Metric sort of thing.  'son of parnas' does raise the point of how much processing is on each of those paths, though.  Having a single exit point means (in theory) that you only have one set of 'clean-up' code.

But as has been earlier mentioned, each error path can have a different set of clean-up code -- so you may be 'over-simplifying' your code in that case.

I find the complexity of code is related to the number of paths through a routine, and to how much you have to keep in mind about "what is the state of the processing at THIS point".

Error 'return' statements let you keep this invariant, as was said above.
AllanL5
Sunday, June 26, 2005
 
 
> , each error path can have a different set of clean-up code

Yep, there's no one always right answer.
son of parnas
Sunday, June 26, 2005
 
 
I'd like to add that perhaps it's a matter of what you are checking or keeping track of.
 
If it's preconditions for being able to perform the task - such as the list item being selected - then a return up front seems neater (like an exception) and leaves no ill side effects to worry about.

Progressing on to resource allocation, where you need to, say, open a file and then manage that resource by closing it in the same function: this is where nested blocks become more useful for managing resource deallocation. But they only work if everything stays within the scope of the one function, not when you are allocating and then leaving scope.

In procedural code I would often follow both patterns described.
 
#include <stdio.h>

void writeComment( char* comment, char* filename )
{
  FILE * log = NULL;

  if( NULL == comment )
    return;

  if( NULL == filename || filename[0] == '\0' )
    return;

  if( (log = fopen( filename, "a" )) != NULL )
  {
    fprintf( log, "%s\n", comment );
    fclose( log );
  }
  else
  {
    // log file open error
  }
}

Whatever you expand the fprintf to, the opened file is closed properly unless you crash the app.
Gorf
Wednesday, June 29, 2005
 
 

This topic is archived. No further replies will be accepted.
