The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

How many (functions/methods) per source file ?

Forgive me if this has been discussed before, but I could not find any previous discussion that covered what I am looking for.

This is a semi-rant about a pet peeve: multi-thousand line long source code files containing dozens of functions (C) and/or methods (C++).

Out of habits for which I cannot recall the source (which is why I am asking here), I prefer to put one and only one function/method per source file.

I am looking for good reasons for this practice that I might present in an effort to improve the project I am currently working on.

Understanding the code in smaller chunks at a time, modularity, and easier code reuse come to mind.

Anyone else have a few clues to offer ?  Thanks.
Dan White
Friday, September 09, 2005
 
 
One method per file? That's extremism in the other direction.
YaYaYa
Friday, September 09, 2005
 
 
I like to have one header file and one source file per class.  This makes things quite easy to keep track of.

My current project has a little over 200 header files and a similar number of source files.  My longest source file is just shy of 36k, my shortest is a little over 1/2k.  I rather like the tool, cccc, the C and C++ Code Counter, which can produce useful information on when one of your classes has become too complex.  I haven't run this tool in quite some time but I'm sure it'd have some complaints.

I think you are going _way_ too far the other way.  My project currently has 38,569 lines of code (according to sloccount).  According to your process of breaking every method out into a separate file, I'd need approximately four THOUSAND files.  That's crazy.  How could this possibly benefit me?
Chris in Edmonton
Friday, September 09, 2005
 
 
"Out of habits for which I cannot recall the source (which is why I am asking here), I prefer to put one and only one function/method per source file."

Holy cow, you must have big functions/methods!  I don't think I have a single function or method that's more than a page.

I have one class per file.  For functions, the source files are grouped by purpose.
Almost H. Anonymous
Friday, September 09, 2005
 
 
I do one class per source file.

But I also do one standalone function per source file.  By doing this, I strongly discourage myself from making very many standalone methods.  (I usually end up with 3 times as many classes as standalone methods.)

I used to group standalone methods by purpose but the groups always drifted too much, causing me to reorganize and recategorize all the time.
Daniel Howard
Friday, September 09, 2005
 
 
One method per file will, among other things, REALLY slow down compile times, won't it? Every single file has to re-open and re-parse every header file.
Chris Tavares
Friday, September 09, 2005
 
 
Yes, this can radically slow down your compiles unless you are using precompiled headers.  I think precompiled headers should bring you back to well within the same order of magnitude as if you had not broken up the source files into one per method.

Most modern C++ compilers support precompiled headers.

Given that you are using precompiled headers, it is theoretically possible for this to SPEED UP your compiles.  If you are often modifying methods and recompiling, this method would lead to only recompiling the specific methods that changed, rather than recompiling the whole file.  Visual C++ already can do this even with all methods in the same file but most other compilers don't.

I dread to see how this would affect your link step.
Chris in Edmonton
Friday, September 09, 2005
 
 
"I also do one standalone function per source file.  By doing this, I strongly discourage myself from making very many standalone methods."

Why?

Making it artificially inconvenient for yourself to use standalone functions is like a carpenter who refuses to buy nails: instead, whenever he needs to nail something together, he takes a screw and files off the thread.  "By doing this," he says, "I strongly discourage myself from overusing my hammer."
Iago
Friday, September 09, 2005
 
 
Old Unix C style was one function per source file, with a header file per set of functions.  This enabled the recompile and link using Make to do the minimal work necessary to bring in a modified function.

Personally, I hate this style.  I prefer my functions to be collected in a 'source library' -- so the file is the 'source library', and as many functions as needed to implement the 'library' function are put in there.  One header file per 'source library', so it's really two files -- a header .h and the source .c

This enables a 'hierarchy' approach toward organizing the source code.  And doing one 'class' per file (with all its methods in the file) is similar to my 'source library' approach.  One 'class' per file is mostly the Java way, too.

One drawback to the 'source library' is that when you change any function in the library, the entire library has to be recompiled.  This doesn't bother me much, actually.  It's human comprehension which is difficult -- having the machine recompile code which hasn't been modified is simple.

Now I've also seen the "one function per file" lead to multi-thousand line functions, too.  Now THAT's bad coding.
AllanL5
Friday, September 09, 2005
 
 
If you are using an IDE it doesn't really matter.

I like one class per file simply so I can navigate the file system. Other than that I generally don't care how much stuff is one file because anything more than a few lines is hard to navigate with an editor anyway.

In C code that doesn't have an organizing principle anyway it doesn't really matter.
son of parnas
Friday, September 09, 2005
 
 
It's a good thing when the editor lets you go directly to the functions defined in the file, using a listbox/treelist or anything else.

It's a thing I miss when I have to switch between different editors.

Compare "oh, function X is somewhere inside this file, let me find it, let's do a find" vs. "let's go to the listbox and choose X name, here it is".

I guess it has got something to do with the subject.

By the way, I prefer 1 file = n functions = 1 idea/concept/public class, grosso modo.
Ross Sampere
Friday, September 09, 2005
 
 
Standalone functions are generally undesirable (to me).  I'd much prefer to create a class method.  I find classes encapsulate functionality better than standalone functions and, when it makes sense, are preferable.

I admit that it's a choice.  Some programmers may prefer lots of standalone functions.
Daniel Howard
Friday, September 09, 2005
 
 
The OP is not going far enough. In addition to one function per file, he must also adopt the 1-4 lines of code per function protocol. Each function should have a really long name that describes exactly what it does.

As long as you produce a separate design document for each function-file, this is a good method.
Rich Rogers
Friday, September 09, 2005
 
 
> I prefer to put one and only one function/method per
> source file ... Understanding the code in smaller chunks
> at a time, modularity, and easier code reuse come to
> mind.
>  -- Dan White


How does SEPARATING related class or module functionality into multiple files make for better "understanding" or "easier code reuse"? :

1) Multiple, gratuitous files require more work to understand what project files are logically related.

2) Even if all related files are maintained in the same directory, the RELATIONSHIP of the functions to one another, even in the same module, is NOT as apparent as when functions are logically ordered in the SAME file.

3) Each function in a separate file means that every function MUST be global.  There can be no private, internal-module functions only.  And although this may be great for external testing, this is an API nightmare to document so that developers know which functions are to be used globally & which should NEVER be used because they are internally-used functions only.
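
A minimal sketch of point 3, assuming a hypothetical queue module (the file name, function names, and capacity are illustrative, not from any real project). When the helper lives in the same file as the API functions, `static` keeps it out of the global namespace; split one function per file and `queue_is_full` would have to become an external symbol.

```c
/* queue.c: hypothetical module; all names here are illustrative. */
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAPACITY 8

/* Private module state: file-scope 'static' hides it from other files. */
static int    queue_items[QUEUE_CAPACITY];
static size_t queue_count = 0;

/* Internal helper: 'static' gives it internal linkage, so no other
   source file can call it and it never has to appear in queue.h. */
static int queue_is_full(void)
{
    return queue_count == QUEUE_CAPACITY;
}

/* Public API (would be declared in queue.h).
   Returns 0 on success, -1 if the queue is full. */
int queue_push(int value)
{
    assert(queue_count <= QUEUE_CAPACITY);  /* module invariant */
    if (queue_is_full()) {
        return -1;
    }
    queue_items[queue_count++] = value;
    return 0;
}

/* Public API: number of items currently stored. */
size_t queue_size(void)
{
    return queue_count;
}
```

Only `queue_push` and `queue_size` would need documenting for other developers; the helper and the storage array cannot be reached from outside the file.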



> I like to have one header file and one source file per
> class.
>  -- Chris in Edmonton

> I have one class per file.  For functions, the source
> files are grouped by purpose.
>  -- Almost H. Anonymous

> I prefer 1 file = n functions = 1 idea/concept/class.
>  -- Ross Sampere

> I prefer my functions to be collected in a 'source
> library' ... as many functions as needed to implement
> the 'library' function are put in there ... This enables
> a 'hierarchy' approach toward organizing the source code.
>  -- AllanL5


I program embedded systems & have developed 50 or 60 different modules in C over the last 7 years.

Each module & all its functionality is typically implemented in 2 files, but may be up to 5 files :

  module.h        header file
  module.c        main module source file
  module_bsp.c    application-board-specific porting file
  module_tbl.c    application-module table file
  app.c/bsp.c      applications file(s) sometimes include
                      module_bsp.c-type functions


But the application-independent logic for each module is located in the header file & main source file.

Each module/file should include as many functions/functionality as it needs.  Whether a module has 4 functions or 100, if the module is logically sound, then encapsulate & organize the functions in a single file.



I may be at the opposite extreme end of the OP's argument.  Some of these embedded modules have easily achieved 10,000; 15,000; or 20,000+ lines with 100+ functions.

Some of you may *gasp* thinking I'm a fool for such "large" files.  But the functionality in terms of file size is the same regardless of whether it's in the same file or not.

The alternative is to split functionality into multiple files.  But why?  If one function is a state handler that handles multiple states each with their own functions, why have N files each with a state handler?  Organize the functions in one file, with the main state handler function first followed by each state's handler function in some logical order.
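
As a rough illustration of that layout, here is a sketch of a small state machine in a single file, with the main handler first and the per-state handlers after it. The states, events, and function names are assumptions made up for the example.

```c
/* conn_fsm.c: hypothetical connection state machine kept in one file. */
#include <assert.h>

typedef enum { STATE_IDLE, STATE_CONNECTING, STATE_CONNECTED, STATE_COUNT } conn_state_t;
typedef enum { EVENT_OPEN, EVENT_ACK, EVENT_CLOSE } conn_event_t;

/* Per-state handlers are private to this file; the prototypes come first
   so the main handler can be read before the details. */
static conn_state_t handle_idle(conn_event_t ev);
static conn_state_t handle_connecting(conn_event_t ev);
static conn_state_t handle_connected(conn_event_t ev);

/* Main state handler: dispatches to the handler for the current state
   and returns the next state. */
conn_state_t conn_fsm_step(conn_state_t state, conn_event_t ev)
{
    static conn_state_t (*const handlers[STATE_COUNT])(conn_event_t) = {
        handle_idle, handle_connecting, handle_connected
    };

    assert((unsigned)state < (unsigned)STATE_COUNT);
    return handlers[state](ev);
}

/* Idle: only an open request moves us forward. */
static conn_state_t handle_idle(conn_event_t ev)
{
    return (ev == EVENT_OPEN) ? STATE_CONNECTING : STATE_IDLE;
}

/* Connecting: an ack completes the connection, a close aborts it. */
static conn_state_t handle_connecting(conn_event_t ev)
{
    if (ev == EVENT_ACK)   return STATE_CONNECTED;
    if (ev == EVENT_CLOSE) return STATE_IDLE;
    return STATE_CONNECTING;
}

/* Connected: stay connected until a close arrives. */
static conn_state_t handle_connected(conn_event_t ev)
{
    return (ev == EVENT_CLOSE) ? STATE_IDLE : STATE_CONNECTED;
}
```

Reading top to bottom gives the dispatch logic first and each state's behavior after it, which is the ordering argued for above.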


> If you are using an IDE it doesn't really matter.
>  -- son of parnas

Modern IDE/editor tools are more than sufficient to easily browse, view, search, grep, etc. source files of 10,000s of lines.



Lastly, my argument also extends to function length.  Each function should do what it requires with few, if any, side effects.  Whether a function requires two lines of code or 1000, functions should NOT be arbitrarily short JUST for the sake of a cliche.

A function to parse a set of parameters may be 30 lines; a function to inspect a TCP transmit queue & transmit functions as necessary may be 300 lines.




The extreme proof-by-induction conclusion to combining the two cliches "each function should be defined in its own file" and "functions should be short" leads to 1,000s of files each with "short" functions calling many more 1,000s of "short" functions/files!


I definitely prefer the alternative, use a decent IDE editor to manage projects with 10s of files & 100s of functions.
Ian Johns
Saturday, September 10, 2005
 
 
> Some of these embedded modules have easily achieved 10,000; 15,000; or 20,000+ lines with 100+ functions.
> Some of you may *gasp* thinking I'm a fool for such "large" files.

I did gasp. I don't think you're a fool. An evil genius, maybe, but not a fool.

How on earth do you test code like that? You say you work on embedded systems, so I assume that you don't do all your testing on hardware.
Chris Steinbach
Sunday, September 11, 2005
 
 
> How on earth do you test code like that? You say you
> work on embedded systems, so I assume that you don't
> do all your testing on hardware.
>  -- Chris Steinbach

For my latest development, I used a combination of simulated unit tests and real-time debugging on a target: breakpoints on code that could be stepped through at debug time without changing the behavior of the system, or output of real-time variables to a display.


Every intelligent or experienced developer should recognize the difference between code that is real-time-invariant & code that is real-time-dependent.

For instance, a function which receives N parameters & outputs M results can easily be tested in simulation (unless the function is non-reentrant, but EVERY function should be made reentrant if at all possible).
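
A small sketch of what that can look like, assuming a hypothetical running-average routine (the names are invented for illustration). The function keeps no static or global data; all state lives in a caller-owned struct, so it is reentrant and can be exercised on a host machine with plain assertions.

```c
#include <assert.h>
#include <stdint.h>

/* Caller-owned state; each user of the routine supplies its own copy. */
typedef struct {
    int32_t sum;
    int32_t count;
} avg_ctx_t;

void avg_init(avg_ctx_t *ctx)
{
    ctx->sum = 0;
    ctx->count = 0;
}

/* Reentrant: reads and writes only its arguments.
   Returns the running mean using integer division.
   (Sketch only: overflow of 'sum' is not handled.) */
int32_t avg_update(avg_ctx_t *ctx, int32_t sample)
{
    ctx->sum += sample;
    ctx->count += 1;
    return ctx->sum / ctx->count;
}

/* Host-side simulation test: two interleaved contexts must not
   disturb each other, which a static accumulator would not allow. */
void avg_self_test(void)
{
    avg_ctx_t a, b;

    avg_init(&a);
    avg_init(&b);
    assert(avg_update(&a, 10) == 10);
    assert(avg_update(&b, 100) == 100);
    assert(avg_update(&a, 20) == 15);
    assert(avg_update(&b, 300) == 200);
}
```

Had the accumulator been a `static` variable inside `avg_update`, the interleaved calls in the test would have mixed the two data streams.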

For functions whose values require real-time inputs/outputs, it may be harder to test without introducing testing artifacts, but real-time output or logging of variable values & code events can sometimes adequately test the code correctness & coverage.


Regardless, there is always some vector of N cases to test, whether in simulation or real-time.

If N is finite & small, test all cases.  If N is moderately large, divide the cases & test the most common or those that can logically be grouped into a small subset of cases.  If N is huge, do your best.  :)
Ian Johns
Monday, September 12, 2005
 
 
I would like to thank all of you for your input.  The points you raise caused the old Grey Matter to percolate quite a bit.

I want to make a few clarifications on my original statement and I also want to respond to a few of your comments:

First of all, I find myself in total agreement with the "one header and one source file per class" concept echoed by many if not all. 

AllanL5 hit the nail on the head about the source of my delusions:

"Old Unix C style was one function per source file, with a header file per set of functions.  This enabled the recompile and link using Make to do the minimal work necessary to bring in a modified function."

I learned C under a Make-guru who pounded minimal necessary recompile into me with very large and heavy objects.

The project I am currently working on is that mutant mix of C and C++ I have come to loathe.  Some proper C++ classes, some Groups-of-C-Functions-Trying-to-Act-Like-a-Class, and a bunch of poorly organized C functions.  A tangled mess that has evolved into its current state over a period of 10+ years.

Almost H. Anonymous said
"For functions, the source files are grouped by purpose."

and Ian Johns said:
"Even if all related files are maintained in the same directory, the RELATIONSHIP of the functions to one another, even in the same module, is NOT as apparent as when functions are logically ordered in the SAME file."

I concur, guys, but as a result of being patched and re-patched over the years, there is no visible logic or purpose to the order or organization of the pieces of these files.

Ian further points out:
"Each function in a separate file means that every function MUST be global.  There can be no private,
internal-module functions only."

True, but I would choose that over identically named (and almost identically functional) local functions appearing in four separate places -- no lie. 

Bottom line: this project is in serious need of refactoring/re-engineering, but no one wants to take the time/effort/$$$'s to do it.  Most frustrating.

Again, thank you all for sharing your thoughts.  I resolve to "unclench" and group functions/methods in a purposeful and logical manner.
Dan White
Monday, September 12, 2005
 
 
> I resolve to "unclench" and group functions/methods in a purposeful and logical manner.

If they weren't built in a purposeful and logical manner, it's unlikely they can be regrouped in a purposeful and logical manner.

Move on to something else.
son of parnas
Tuesday, September 13, 2005
 
 

This topic is archived. No further replies will be accepted.
