The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

The Problem With NULL

I'm certain others will address this topic more eloquently than I...  But here it is, for what it's worth:

The data modeling perspective: As a matter of practice, NULL (where used) should be indicative of something specific that cannot be described better by an actual valid value for the particular data type – even if the indication merely conveys the absence of something.

From a programming perspective, NULL is problematic (Note: NULL is not the same thing as a zero-length string or a numeric value of zero)…  Normally, it takes a significant amount of code to handle NULL values since string values have to be tested for both NULL and for zero-length, and numeric values have to be tested for both NULL and zero, etc…  In fact, certain kinds of variables can not even be safely compared in code without first testing for NULL…  Failing to do so will cause runtime errors when the variable is NULL.  One problem with this is that buggy code can pass testing as long as no NULL values were encountered in the test data.  Unless NULL has a different (and specific) meaning other than empty string or zero (respectively), it is not worth the trouble.

Further, code is much easier to construct (and more importantly, read) when data is modeled such that programming for dual/multiple meanings is not necessary.  Introduction of the NULL possibility can have the affect of doubling the conditions that must be handled in code with respect to a particular data element…  But imagine the consequences when one or two or three other data elements (all of which may also be NULL) are introduced into a specific set of logic in which all values must be compared and the various combinations of values are significant to the logic path.  The number of conditions that must be handled can increase exponentially.

In summary, when relationships and constraints can be applied to ensure data integrity, deterministic computing is much easier to achieve.  In other words, the integrity of the code is not at the mercy of the data when the data is maintained in a concise, descriptive form that requires very little programmatic interpretation.
Leonard Lewis Send private email
Friday, May 11, 2007
 
 
What are you saying?

Is it a bitch to handle NULLs? - Yes

Are NULLs necessary? - Yes

so deal with it...
Bob
Friday, May 11, 2007
 
 
I wish you had clarified you were talking about a database null (usually shown lowercase at least in Oracle), not C/C++ NULL which is just zero. I read 3/4 of your post without realizing that.
Ben Bryant Send private email
Friday, May 11, 2007
 
 
Are you are arguing that the concept of null is an unnecessary complication in today's databases?

Our code almost never tests for null, and once every few years a record gets a null in it somehow, and something barfs. I'm not exactly proud of it, but in the grand scheme of things, there are a million things I worry about a lot more than nulls.
Greg Send private email
Friday, May 11, 2007
 
 
The use of NULL is not inherently disintegrous, although it can certainly be abused unnecessarily.  Null is necessary in some situations for closure, just as Zero is necessary for closure in algebra.  Before completely denegrating NULL, consider that Zero was one of the greatest inventions of mathematics since One, warts and all.

One may wish common high level languages handled NULL as gracefully as algebra handles Zero, but we will never be rid of the NULL check just as we will never be rid of the divide check.
Mikkin
Friday, May 11, 2007
 
 
This comes from an incomplete analysis of your NULL situation.  If NULLs bite you in the butt regularly, then either you are using them where you didn't intend to or you haven't defined your behavior for NULL cases.

Remember, NULLs don't mean 0 or "".  That's why you can't add a NULL numeric column with impugnity.

This works best by example. 

Case 1:  an application records the monetary discount amount of a transaction in the discount column.  When the system was implemented five years ago, discounts were not recorded in the system and some of the existing code from five years ago is still running.  New code fills discount with $0.00 if the transaction was made at full price.  If you are looking for a total discount for a set of records, it would be appropriate to treat NULLs as zeros.  You can do this either in the SQL, or in the code.  To do it in the SQL, use the appropriate NULL replacement function (MS SQL -- ISNULL, Oracle -- NVL, etc...).  To do it in code, either use some sort of if test, or a nifty language feature like discount.GetValueOrDefault(0) from .Net 2.0.

Case 2:  An application records each return of a cash drawer.  When a drawer is given to a cashier, the DrawerTakeoutAmount column has the initial amount seeded in the cash drawer.  During the original return, a record is created but with the DrawerReturnAmount column NULL.  When a manager counts the drawer, he/she puts in the register receipt amount and the DrawerReturnAmount.

If you want the average shortage for a given cashier, NULLs can make this easier.  In SQL:

SELECT
  AVG(DrawerReturnAmount - (DrawerTakeoutAmount + TotalSales))
FROM
  CashDrawers
GROUP BY
  CashierID

The query completely ignores uncounted drawers.  To do this in code, you would have to do special processing, hence the need for NULL checking.  I would recommend choosing a language the supports Generics so this can be a compile time error instead of a run time issue.  .Net 2.0 has a Nullable(Of Decimal) data type behaves properly with regard to your concern.

This code can throw an exception if one of the fields are nullable:

Dim x As Integer = dr("Col1") + d("Col2")

However, if you wrap your data in these properties:
---------------------------------------------------
Public Property Col1() As Nullable(Of Integer)
  Get
    ...  put code here to retrieve data ...
  End Get
    ... Set removed due to not wanting to bother ...
End Property

Public Property Col2() As Nullable(Of Integer)
  Get
    ...  put code here to retrieve data ...
  End Get
    ... Set removed due to not wanting to bother ...
End Property
----------------------------------------------------

Then, this code won't compile:

Dim x As Integer = obj.Col1 + obj.Col2

This will:

Dim x As Integer = obj.Col1.GetValueOrDefault(0) + obj.Col2.GetValueOrDefault(0)

This will too:

If obj.Col1.hasValue Then
  ' Do something with data
End If

So, the richer data types allow you to see your NULL issues before they become runtime errors and protect you from shooting yourself in the foot.
JSmith Send private email
Friday, May 11, 2007
 
 
NULL is less than ideal because it's metadata that's treated as data.
dev1
Friday, May 11, 2007
 
 
How did nulls get in your Strings?
Did you neglect them?
Assignments must yield values.
The Spirit of the Electoral College Send private email
Friday, May 11, 2007
 
 
We don't allow nulls in our database as a general standard.  Dates are the exception, though.  You can't have a zero date on a date data type.

This topic is always good for a flame war.
OneMist8k
Friday, May 11, 2007
 
 
> Normally, it takes a significant amount
> of code to handle NULL values since string
> values have to be tested for both NULL and
> for zero-length, and numeric values have to
> be tested for both NULL and zero, etc…

Normally, if you have to deal with NULLs, it's because NULLs mean something.  If you're encountering NULLs that are semantically equivalent to the empty string, that's a problem.  But it's not a problem with NULL.  It's a problem with the design of your database, and it should be fixed there.

> Unless NULL has a different (and specific) meaning
> other than empty string or zero (respectively), it
> is not worth the trouble.

In other words, don't work with databases that were designed by people who don't understand NULL.  And if you find a nullable column in your database that shouldn't be, fix it.

> Further, code is much easier to construct (and
> more importantly, read) when data is modeled such
> that programming for dual/multiple meanings is not
> necessary.

Here, however, you are Procrustes shortening his guest to fit the bed.  Ease of construction and maintenance aside, code has to isomorphize onto the real world.  In the real world, more often than not there is a real, significant, tangible difference between "we know this to be nothing" and "we don't know what this is."  Code may seem easier to construct and maintain if it acts as though this distinction doesn't exist.  It's not.

Before databases had NULL, we used magic numbers.  "This value can never be less than zero, so if we store -1 in the field we'll know we haven't recorded the value yet."  What you're essentially proposing as a replacement for NULL is that we standardize on zero and the empty string as our magic numbers.

> Introduction of the NULL possibility can have
> the affect of doubling the conditions that must
> be handled in code with respect to a particular
> data element…

Yes, because *those conditions exist and must be handled*. 

You bemoan the fact that the presence of NULLs in your data is resulting in code that appears to work but really doesn't.  That's true.  But it's not because there are NULLs in your data.  It's because one of two conditions are true:  1) Your data's bad.  2) Your code's bad.

If a column's nullable, and it shouldn't be, fix the column.

If your code's not dealing with a column that's nullable, and the column should be nullable, fix your code.

To me it seems as if you are complaining that because you keep having to replace burnt-out fuses, and because electrical circuits are much more straightforward if they don't contain fragile components that can burn out under heavy load, we should really rethink the idea of fuses.
Robert Rossney Send private email
Saturday, May 12, 2007
 
 
I think that most of you are misunderstanding the original poster.  I am making a programming language in which pointers cannot be null.  However you can make something nullable, kind of like in C#.  So here is my syntax:

int i //makes a variable that holds an int

int@ ip //makes a variable that holds an int pointer

int? in //makes a variable that can be null or can be an int

int@? ipn //makes a variable that can be null or holds an int pointer

int?@ inp //makes a variable that holds a pointer to a nullable int

By doing this if you see a plain pointer you know that the value won't be null.
Tom C
Saturday, May 12, 2007
 
 
I'm not sure I understand the problem.  You don't need NULLs in your app and don't want to handle them separately? 
Then just say "in this application we never have any NULLs", and use NOT NULL constraints to enforce this convention.

Just be sure not to use SQL constructs that can result in NULLs (e.g., outer joins).
Jeff Zanooda Send private email
Saturday, May 12, 2007
 
 
> NULL is less than ideal because it's metadata that's treated as data.

A very interesting way of looking at it. How would you handle what null means? Through a row variable or something?
son of parnas
Sunday, May 13, 2007
 
 
> A very interesting way of looking at it. How would you handle what null means? Through a row variable or something?

The usual way to handle this is by introducing another table. Suppose that the price of some items is unknown:

create table item (
  item_id integer not null primary key,
  item_price money null,
  ...
)


Instead, you could have this:

create table item (
  item_id integer not null primary key,
  ...
)

create table item_price (
  item_price_item_id integer not null primary key,
  item_price_price money not null
)

alter table item_price add constraint fk_item_price foreign key (item_price_item_id) references item (item_id)


Now we have no nullable columns. However, all code using the database still needs to deal with the fact that not all items have prices. In fact, the most obvious query leaves you exactly where you'd be if you'd stuck with the initial table:

select item_id, item_price_price from item
  left join item_price on item_price_item_id = item_id
clcr
Sunday, May 13, 2007
 
 
> Remember, NULLs don't mean 0 or "".

Allow me to introduce you to Mr Oracle.
Iago
Sunday, May 13, 2007
 
 
You can use the Null Object design pattern to include NULL valued objects in your designs.

http://en.wikipedia.org/wiki/Null_Object_pattern
sensible
Sunday, May 13, 2007
 
 
There is definitely a history of using null where it is not necessary. I worked with a framework that returned null for "no matching results" whether the data came from a database, webservice, cache or whatever. This forced users to do things like ...

if ( result != null )
  for( Thing thing : result )
      ...

Returning an empty collection would have been a much better choice. One opportunity to get rid of nulls.

The same vendor recommended a null transaction object to indicate a transaction already in progress so the code in question didn't have to do its own transaction mgt. Again it led to many copies of ...

if ( transaction != null )
  transaction.commit();

We replaced this recommendation with a null object as suggested in the previous post, ie a transcation that does absolutely nothing. Client code can always call commit(), simplifying the code path and insulating against accidental omission of the test.
Stan James Send private email
Monday, May 14, 2007
 
 
NULL makes sense in a database.  Probably the best way to handle the NULL in application code is through a NULL object. 

Build an interface and code two child classes from it, one to handle the case where data is present and one to handle the case where the database entry is NULL.  Create a class factory to generate the appropriate child class when reading in the value from the database. 

Investing the effort to use two child classes will usually greatly simplify processing in the remainder of the code - you no longer have constantly check if you are dealing with regular data or NULL data.
Wayne M.
Monday, May 14, 2007
 
 
OMG, NULL controversy again.

NULL lovers: just use null and be happy
NULL haters: invent NULL replacement that means "No data available here" for each occasion there is and be happy

There is no free "no data" handling.

Period.
moronica
Monday, May 14, 2007
 
 
"string values have to be tested for both NULL and for zero-length"

What am I missing here? NULL means zero length so you're only testing once...

char *Text = NULL;

...

if (Text){
xxxxx
}

That's only one check.
Jim
Monday, May 14, 2007
 
 
IMO as much as possible avoid null. If you can't. Deal with it.
j2e
Tuesday, May 15, 2007
 
 
.Null. In Databases Systems are a necessity. Because how else do you represent an Unknown. If want to get an inventory value and I don’t know the value of a particular item who do I do?? If I put 0 it is wrong.

All that said I still never allow them… It’s just too hard of a concept to explain to a user. I am item has a 0 value there inventory reports come out wrong and I explain that it is a data Issue.. “Look you forgot to assign a value to this Item”. They fix it an all is well. 

Maybe one we’ll have more sophisticated users :-)
Anon Ranter Send private email
Tuesday, May 15, 2007
 
 
Nulls in a database are simply an alternative to splitting a table and using joins.  Many times allowing a null value and not having to create and link a separate table is the easier approach.
Wayne M.
Tuesday, May 15, 2007
 
 
>>Nulls in a database are simply an alternative to splitting a table and using joins.  Many times allowing a null value and not having to create and link a separate table is the easier approach.

Nope. Sum on join with the child table having no records returns 0

Sum on a table with one value of .NULL. returns .NULL.

Lack of records will never cause return a NULL
Anon Ranter Send private email
Tuesday, May 15, 2007
 
 
"Lack of records will never cause return a NULL"

Absolute BS

Link Client table -> Order table

Results

ClientID  OrderID
1234        A344
3333        NULL
4444        B123
4444        B124
2345        NULL
9483        NULL

Clients 3333, 2345, and 9483 do not have Order records so it returns NULLs
Bob
Tuesday, May 15, 2007
 
 
I meant with a Sum()

Soory...

Tuesday, May 15, 2007
 
 
"Nope. Sum on join with the child table having no records returns 0"

That's wrong too - I just tried it with MSSQL and if there are no child records SUM() returns NULL
Bob
Tuesday, May 15, 2007
 
 
To clarify what Iago said about nulls and Oracle databases:

If you insert a zero-length string into a varchar column in Oracle, Oracle inserts a null.  This irrational behavior has cost countless hours!

I suspect the poor OP is using Oracle.
XYZZY
Tuesday, May 15, 2007
 
 
"That's wrong too - I just tried it with MSSQL and if there are no child records SUM() returns NULL "

Yuck. What bad design. That's just creating proplems. The sum of an empty column/array/list should be the identity for +, which is 0, for closure, etc.

C.J. Data would be rolling over in his grave if he were dead.
Paul Mansour
Wednesday, May 16, 2007
 
 
A reader
Friday, May 18, 2007
 
 
NULL means UNKNOWN.  if some columns in your database need to support unknown values, then they should be nullable.

NULL does not mean 0 or empty string.  Here's my favorite illustration:  I have two boxes.  One contains nothing with certainty.  The other has unknown contents.  It may contain nothing, maybe a pile of cash, maybe a dog turd.

which box would you rather have?  can you see there's a difference?
Jesse
Thursday, June 07, 2007
 
 
"I wish you had clarified you were talking about a database null (usually shown lowercase at least in Oracle),"

HUH?

..
Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit
..
SQL> select null from dual;

N
-


SQL> select NULL from dual;

N
-


SQL> select NuLl from dual;

N
-


SQL> set null NuLlAnyCasEYouLike
SQL> select NUll from dual;

N
-
NuLlAnyCasENYouLike
Hans Anderson
Friday, June 08, 2007
 
 

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics
 
Powered by FogBugz