The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

Error codes

I'm currently bringing a bit of order to my application's error reporting system, and I'd like your thoughts on your preferred approaches to returning error codes.

Now, my application works as a kind of server. Clients submit requests, requests get processed, and then clients get some sort of a response. Sometimes, that response is only "OK." Sometimes, that response includes some return data. And in all other cases, the response is some form of error report.

The clients are not tied to any specific platform or standard. The only thing they have in common is that most (but not all!) of them accept XML. That XML reaches them over a variety of protocols, from SOAP and XML-RPC to HTTP POST or even plain socket connections.

I'm trying to build an error reporting system that is consistent for all clients and connection methods.

Right now, my app always returns an error code and an error message. 100/"More or less detailed message about what exactly was done" is an example of a successful operation. Everything else is an error.

An error code plus an error message, the first for the computer and the second for the human, seems to be the most universal approach.

One code for "OK" seems fine: that way, the client knows whether it worked or not. The client can implement error handling scenarios later. On my side, I can add more errors without fearing that I'll break something on the client side. It also seems to translate pretty well to transports like SOAP, which have clearly defined error handling.

But it's not enough.

For starters, I'm not sure if my "code" should be a fixed number of digits or not. I'm not too confident in the HTTP-like "100 equals OK" model - maybe "0 means OK" is better. Should I limit myself to positive error codes, or should I include negative ones as well?

And then, I really need warnings. OK/not OK is just not good enough. I need a way to say "everything went fine, but watch out, it may not be what you wanted".

And of course, it would be nice to distinguish between implementation errors (where clients submit improperly formatted requests) and service errors. And since my app can have plugins of a sort, it would be great if I could reserve a range of codes for them.

Though I might live without several errors in one response. Maybe.

But I'm absolutely positive that I want to keep it as simple as possible.

What error models did all of you find easiest to work with?
Domagoj Klepac
Saturday, February 25, 2006
I like a model I learned from my days on an IBM mainframe, which I still use today, in one form or another. The numbers here aren't as important as being consistent in their use.

First, define a standard set of status codes, such as:
0 - no error
4 - warning
8 - error
12 - critical error

Next, each error/message has an ID and corresponding text,
0 - Transaction completed successfully,
1 - Insufficient funds,
2 - Out of memory,
3 - Fatal system error,
which you store in a file/database.

Each function you write requires an output argument to hold the error/message ID. Functions return one of the status codes and set the output argument to the error/message ID. The caller evaluates the status code, and can then display a message if needed by looking up the appropriate message for the ID. You can also require two output arguments, one for the ID and one for the message, or if you're using a language that lets you pass structures or objects, you can have a single argument that lets you get at the ID and message.

Having a consistent set of status codes lets your clients decide how to react after calling your function. The message ID numbers make it easy to find where in the code a message is generated. They also let you change the message text (for localization, for instance) without having to change or re-compile code.
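
A minimal C++ sketch of the scheme described above, for illustration only. The status values and message texts come from the post; the function `withdraw`, the helper `lookupMessage`, and the in-memory table (standing in for the file/database) are assumptions made up for this example.

```cpp
#include <map>
#include <string>

// Status codes from the post: these are the function return values.
enum Status {
    STATUS_OK = 0,
    STATUS_WARNING = 4,
    STATUS_ERROR = 8,
    STATUS_CRITICAL = 12
};

// Message table keyed by message ID. The post keeps this in a file or
// database; an in-memory map stands in for it here (assumption).
const std::map<int, std::string>& messageTable() {
    static const std::map<int, std::string> table = {
        {0, "Transaction completed successfully"},
        {1, "Insufficient funds"},
        {2, "Out of memory"},
        {3, "Fatal system error"},
    };
    return table;
}

// Hypothetical helper: look up the text for a message ID.
std::string lookupMessage(int messageId) {
    std::map<int, std::string>::const_iterator it = messageTable().find(messageId);
    if (it == messageTable().end()) {
        return "Unknown message ID";
    }
    return it->second;
}

// Hypothetical operation: returns a status code and reports the
// message ID through the output argument.
Status withdraw(long balance, long amount, int& messageId) {
    if (amount > balance) {
        messageId = 1;  // Insufficient funds
        return STATUS_ERROR;
    }
    messageId = 0;      // Transaction completed successfully
    return STATUS_OK;
}
```

A caller would branch on the returned status first and only call `lookupMessage` when it actually needs text to show or log.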
Former COBOL Programmer
Saturday, February 25, 2006
Any reason not to use exceptions and turn each error code into an appropriate exception?
son of parnas
Saturday, February 25, 2006
I like the IBM mainframe approach - that's exactly what I'm interested in, different approaches that worked for you. I'm actually thinking about somehow storing the two pieces of information in the "error code". For example, a variation:
0: OK
1-1000: Warning
1001-2000: Invalid request or request parameters
2001-3000: Error
3001-4000: Fatal error

...and then the message itself remains something which is displayable to the user.
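
To make the range idea concrete, here is a rough C++ sketch of how a client might classify such codes. It assumes the boundaries 1-1000, 1001-2000, 2001-3000 and 3001-4000, and it treats anything outside them (including negative values) as unknown; the names `Severity`, `classify` and `succeeded` are invented for the example.

```cpp
// Severity classes implied by the numeric ranges in the post.
enum class Severity { Ok, Warning, InvalidRequest, Error, Fatal, Unknown };

// Map a numeric code to its severity class by range.
Severity classify(int code) {
    if (code == 0)    return Severity::Ok;
    if (code < 0)     return Severity::Unknown;  // negatives are not used in this scheme
    if (code <= 1000) return Severity::Warning;
    if (code <= 2000) return Severity::InvalidRequest;
    if (code <= 3000) return Severity::Error;
    if (code <= 4000) return Severity::Fatal;
    return Severity::Unknown;                    // outside every defined range
}

// The operation went through if the code is OK or only a warning.
bool succeeded(int code) {
    Severity s = classify(code);
    return s == Severity::Ok || s == Severity::Warning;
}
```

A client that knows nothing about individual codes can still call `succeeded` and fall back to showing the message text, which keeps new codes from breaking old clients.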

Son of parnas, I'm aware that an error message is convertible to an exception, but not all of my clients (or wire formats) can handle exceptions. So I'm transforming my exceptions to something more universal. If I'm talking Java on both sides, and it's Java RMI on both sides, then eventually I'll throw a RemoteException. But in most other cases, I'll have to populate some fields with some data.
Domagoj Klepac
Sunday, February 26, 2006
I agree with Former COBOL, probably because it has worked for so long. If possible, I would also suggest that you set aside one extra byte for messages, either using 1000-9999 as a warning, 10000-19999...

_My preference_ is to use a one-character message type (E - error, W - warning, etc.). This lets me see in a second how bad things are, and I get to use E0001-E9999, W0001-W9999, etc. Plus, should I need to add something later, a new message type gives me another range to work with (D0001-D9999).

0 (zero) should always mean that everything went perfectly. I also avoid negative numbers because many databases use them, and it just gets confusing when -811 can mean two things.
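
For what it's worth, a prefixed code like this is easy to take apart again. The C++ sketch below assumes a fixed shape of one uppercase letter followed by exactly four digits, with 0000 rejected because the ranges above start at 0001; `ParsedCode` and `parseCode` are names made up for the example.

```cpp
#include <cctype>
#include <string>

// Result of splitting a code such as "E1234" into its parts.
struct ParsedCode {
    char type;    // 'E', 'W', ...; 0 if the input is malformed
    int  number;  // 1..9999; -1 if the input is malformed
    bool valid;
};

// Parse one letter plus four digits. Anything else is rejected.
ParsedCode parseCode(const std::string& text) {
    const ParsedCode bad = {0, -1, false};
    if (text.size() != 5) return bad;
    if (!std::isupper(static_cast<unsigned char>(text[0]))) return bad;

    int number = 0;
    for (std::string::size_type i = 1; i < text.size(); ++i) {
        if (!std::isdigit(static_cast<unsigned char>(text[i]))) return bad;
        number = number * 10 + (text[i] - '0');
    }
    if (number == 0) return bad;  // the ranges start at 0001

    const ParsedCode ok = {text[0], number, true};
    return ok;
}
```

With the type and the number separated, a caller can switch on the letter for severity and still compare or log the number on its own.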

One statement you made that concerned me is: "everything went fine, but watch out, it may not be what you wanted"

IMHO - this should never happen. If you knew enough to know you could not be sure - you have an error. Or it is a warning. (Perhaps I just need an example)

I am undecided when it comes to putting messages into tables.  I like that you can change the message without changing code - BUT, it also means that when you look through code and want to know what W1132 means you need to look it up in the table.  If you do put the messages into a table, be sure to document what 1132 means at the point you use it.

Finally - do everyone a favor and never put the same message down for unrelated events, even if the problem is the same. E0001 - on database read, reports a database error. Putting E0001 on every database read is just punishing the person trying to debug a problem. [and it might be you :) ] Even when you grep for it, it comes back 611 times on all 134 tables you have. At the very least make each database/file access a unique error.
Monday, February 27, 2006
I have never seen the point of using codes that need to be looked up somewhere else to make them meaningful.

OTOH, long wordy descriptive messages are no good if you need to test their values in code.

So, I like the 'exceptions approach' of using 'codes' that are self-documenting, as in 'NullPointerException'.

Don't forget that for some users i18n will be an issue, so whatever you do, consider having a way to translate the descriptive text before the end user sees it.
NetFreak
Tuesday, February 28, 2006
Regarding "everything went fine, but watch out, it may not be what you wanted", I actually meant to describe classic warnings. Something along the lines of "The customer was billed, but his IP shows that he's from Nigeria, so you might want to run some additional checks and possibly cancel the transaction".

But I agree with you on that one. I recently saw a system that takes customer data, an amount, and a number of installments, and if the bank does not approve the installments, it automatically bills the customer for the whole amount. So customers would try to buy something, and the transaction was usually declined for insufficient funds, because the system tried to charge the total and ignored the installments. Nuts.

As for the error messages in a database, that might be convenient when you want to report errors in several languages. But for the sake of simplicity, I usually tend to have English logs and debugging errors even in multilingual apps, even though English is not my native language - only the UI and the error messages that the user sees get translated.

I'm usually against non-numeric error codes. It's sometimes useful to have something like:
if (error > 1000) ...

...and avoid having to extract the number from something like "E1234".
Domagoj Klepac
Tuesday, February 28, 2006
Here's one way that I use the lookup of numeric codes: Each code has two messages associated with it, a message that is displayed to the user (external message) and a message that is logged internally (internal message). The external message uses simple, easy to understand language and avoids technical jargon. The internal message contains all the bit-head details you need to troubleshoot the issue. I keep the codes and messages in a database. If needed for multiple languages, I have multiple external messages related to the code, one for each language (internal messages are always in English). I can then generate a message store (resource file, flat text file, etc.) or just have my app hit the database.
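
Roughly, in C++, the lookup could be shaped like the sketch below. An in-memory map stands in for the database, the fallback to English when a language is missing is an assumption, and the class name, code 2001, and all message texts are invented for the example.

```cpp
#include <map>
#include <string>

// One code, two kinds of message: an internal one for the logs and
// one external text per language for the user.
struct MessageEntry {
    std::string internal;                         // always English
    std::map<std::string, std::string> external;  // language tag -> user-facing text
};

class MessageStore {
public:
    void add(int code, const MessageEntry& entry) { entries_[code] = entry; }

    // Technical text for the log; empty if the code is unknown.
    std::string internalMessage(int code) const {
        std::map<int, MessageEntry>::const_iterator it = entries_.find(code);
        if (it == entries_.end()) return "";
        return it->second.internal;
    }

    // User-facing text in the requested language, falling back to
    // English (assumption); empty if nothing is available.
    std::string externalMessage(int code, const std::string& lang) const {
        std::map<int, MessageEntry>::const_iterator it = entries_.find(code);
        if (it == entries_.end()) return "";
        const std::map<std::string, std::string>& texts = it->second.external;
        std::map<std::string, std::string>::const_iterator text = texts.find(lang);
        if (text == texts.end()) text = texts.find("en");
        if (text == texts.end()) return "";
        return text->second;
    }

private:
    std::map<int, MessageEntry> entries_;
};

// Made-up sample data, only to show how the store is filled.
MessageStore sampleStore() {
    MessageEntry declined;
    declined.internal = "Payment gateway declined the charge (made-up detail)";
    declined.external["en"] = "Payment could not be processed.";
    declined.external["de"] = "Die Zahlung konnte nicht verarbeitet werden.";

    MessageStore store;
    store.add(2001, declined);
    return store;
}
```

The same structure could be generated from the database into a resource file, since nothing in it depends on where the entries came from.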

Not perfect, but it works...
Former COBOL Programmer
Tuesday, February 28, 2006
My thoughts on this are as follows.

I never keep error messages outside of code logic such as in a database because I want to be able to see the logic that produced the error.

Secondly, my code should not be handling exceptions.  A solid design accounts for all user input for a particular field.  When reports from the user community come in that this particular error came up when I put "1 2" for the customer record I go and find out why my code tripped a default exception handler.  I then update my code to handle the previously unhandled data format; which, in this case, is very rare for me.

In my development environment, I use fill fields that have pattern checking done right on the field.  Any errors I generate in code are for things like: "Customer record already exists" type things.  And, as I stated, I want those types of error reports in the logic that generates them.

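  SQL->Active = true;   // open the dataset, which runs the query

  if( SQL->RecordCount > 0 ) {
    // a matching record was found, so report it right where it is detected
    ShowMessage( "Customer Id already exists ......." );
  }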
Eric (another ISV guy with his company)
Tuesday, February 28, 2006
If you're dead set against exceptions, I personally really like the HRESULT return value used by COM code under Windows.

Basically, it's a 32 bit value which has various bit-fields for particular purposes.

Bit 31 is the success/fail bit. This is also the sign bit in two's complement, so all negative values are failures and all non-negative values are successes.

There are 11 bits for a "facility code", basically saying what subsystem produced the error.

And the low 16 bits are for the actual failure code.

The net result is you have a fairly easy way to define lots of different error codes, and they're all fairly easy to work with. Typically, you'll define a macro/constant for each unique error code, and use the SUCCEEDED and FAILED macros to figure out what to do. Something like this:

  HRESULT hr = GoDoSomething( );
  if( FAILED( hr ) ) {
    switch( hr ) {
      ... whatever ...
    }
  }

Anyway, that's the basic idea.
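
For anyone not on Windows, the same layout can be imitated in portable C++. The sketch below is not the Windows headers: `ResultCode`, `makeResult`, `isFailure`, `isSuccess`, the billing facility and its two codes are all invented names, and only the bit positions described above (bit 31, an 11-bit facility, a 16-bit code) are taken from the post. The remaining bits are simply left at zero here.

```cpp
#include <cstdint>

// A 32-bit signed result, laid out like the description above.
typedef std::int32_t ResultCode;

// Pack failure flag (bit 31), facility (11 bits from bit 16) and
// code (low 16 bits). The final cast relies on two's complement
// wrap-around, which holds on mainstream compilers.
inline ResultCode makeResult(bool failure, unsigned facility, unsigned code) {
    std::uint32_t bits =
        (static_cast<std::uint32_t>(failure ? 1u : 0u) << 31) |
        ((static_cast<std::uint32_t>(facility) & 0x7FFu) << 16) |
        (static_cast<std::uint32_t>(code) & 0xFFFFu);
    return static_cast<ResultCode>(bits);
}

// Bit 31 is the sign bit, so failure is simply "negative".
inline bool isFailure(ResultCode r) { return r < 0; }
inline bool isSuccess(ResultCode r) { return r >= 0; }

// Extract the individual fields again.
inline unsigned facilityOf(ResultCode r) {
    return (static_cast<std::uint32_t>(r) >> 16) & 0x7FFu;
}
inline unsigned codeOf(ResultCode r) {
    return static_cast<std::uint32_t>(r) & 0xFFFFu;
}

// Hypothetical codes for a made-up "billing" subsystem.
const unsigned FACILITY_BILLING = 5;
const ResultCode RESULT_OK = 0;
const ResultCode BILLING_E_DECLINED = makeResult(true, FACILITY_BILLING, 1);
const ResultCode BILLING_S_REVIEW = makeResult(false, FACILITY_BILLING, 2);  // success, but worth a look
```

A plugin range falls out of this naturally: each plugin gets its own facility number and is free to number its codes from 0 to 65535.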
Chris Tavares
Tuesday, February 28, 2006
Oh, and HRESULTs support the "warning" semantics by having multiple success codes as well as multiple failure codes. So S_OK is 0 and means "everything worked fine", S_FALSE means "everything worked, but the result is logically false", and you can define whatever success codes you need.
Chris Tavares
Tuesday, February 28, 2006

This topic is archived. No further replies will be accepted.
