The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

Rant on Excel

<rant>
Folks in my industry seem to think that Excel makes a dandy vehicle for passing data from desk to desk, and company to company. In fact, it sucks for that purpose. One of the characteristics you want most from a data transfer system is that the data emerge unchanged at the far end. You can't count on this with Excel. Excel changes too much, and makes too many assumptions about the data.

The problems are shared by Excel per se, and the Excel database drivers.

I spend a couple hours each week preparing Excel files before reading the data into my SqlServer database.

The most obvious problem is that Excel will change text to numbers, causing leading zeros to disappear from SSNs and aip codes. There are also problems getting dates recognized as  dates. If the top rows of an date column are blank, the Excel drivers never figure out the column is a date column, even if you gave it a date format.

If you have text '0011' and export to csv, it gets saved without quotes, so when read into Excel, it looks like a number and the leading zeros are dropped. all sorts of problems like these.

I'm not an Excel guru, so maybe there are ways around these problems. It should work better than it works for me.

</rant>
Nutmeg programmer
Wednesday, January 09, 2008
 
 
Indeed, Excel is a tool of Satan.

It should never be used as a data source, a database or anything other than a spreadsheet.

Unfortunately, it is the pinnacle of many users' computer savvy (of which they are proud) and they will insist on perverting it to perform any and all tasks.
Cade Roux Send private email
Wednesday, January 09, 2008
 
 
Do not forget the problems with local preferences: dates that are reinterpreted from American to European or v.v. and decimal separators that are reinterpreted as thousands separators.

Horror.
Karel Thönissen Send private email
Wednesday, January 09, 2008
 
 
Ah yes, dates in Excel. I have gotten into the habit of almost always forcing them to be text fields, so I don't have to deal with Excel randomly guessing which number is the month.
Greg Send private email
Wednesday, January 09, 2008
 
 
Amen brother.  It is the tool of Satan.
JimK Send private email
Wednesday, January 09, 2008
 
 
"It should never be used as a data source, a database or anything other than a spreadsheet."

Heretic! Torture Excel until it confesses to be a database. Then burn it at the stake.

There are more funny things people can do with Excel, e.g. build large table systems with links to other Excel files -- absolute, of course, including the drive letter. Works PER-FECT in an environment where these drive letters are dynamically mapped by a logon script, depending on the current location.
Secure
Thursday, January 10, 2008
 
 
I think you’ll find that only part of the guilt lies with Excel. Depending on how you’re loading the Excel files into SQL Server (DTS for example), then part of the guilt also lies with the data connection technology.

Both DAO and ADO, for example, make assumptions about the nature of the data when reading an Excel file. They will read in a limited number of records and make an assumption about each columns’ data type based on that small sample size. This can be addjusted but it’s not perfect.

A couple of suggestions and things to be aware of:

* Where possible – you (yup, you) supply the templates that users should fill in for any data to be loaded in to SQL Server. Hey you’re the DBA, if you’re responsible for data quality in your system you should have a say in this.

* Format any cells intended to contain strings as Text – this will tel Excel to retain the leading zeroes. You could also have code behind the sheet prefix  each entry with an apostrophe (which explicitly tells Excel that this is a string entry)

* When an ETL tool uses ADO or DAO to extract data from an Excel workbook, the cell formatting is irrelevant and completely IGNORED. Formatting dates a particular way makes no difference. Remember the ETL tool is not opening a spreadsheet with formatting – that’s at the presentation layer – it’s reading the underlying data.

* Ensure all dates are entered (preferably as strings) in DD-MMM-YYYY format to remove any ambiguity during the SQL load.

* Use Excel Data Validation. However, this will only apply basic validation to entered data. It wont capture omissions which will result in NULLS during the SQL load.

* Alternatively prepare the templates to save data in another format (XML, comma or tab delimited). It’s much easier to export the data as formatted on the worksheet at this stage. More upstream work means less downstream headaches.

* So you’re aware, internally Excel stores dates as numbers where the number 1 represents 01-JAN-1900 and increments 1 for each day ever since.  Today is day 39,457.

* Educate your users as to what Excel is and what it is not.

“It should never be used as a data source, a database or anything other than a spreadsheet.”
The distinction between the two has always been a blur and is becoming more so with Excel now supporting 1 million rows.

Unfortunately this seduces many users in to believing that Excel is a REAL database.

“It is the tool of Satan.”
Not really – it’s the tool of people who have to get stuff done (preferably with the IT department getting involved. Excel is a fantastic tool with great flexibility and programmability. I challenge you to find a bigger COM object than Excel (in terms of sheer number of properties and methods).

Unfortunately the flexibility and versatility comes at a price when implemented without discipline or forethought.

Good Luck
Marcus not from Melbourne
Thursday, January 10, 2008
 
 
@Secure,

Excel does support UNC paths
Marcus not from Melbourne
Thursday, January 10, 2008
 
 
George Jansen Send private email
Thursday, January 10, 2008
 
 
I wrote some reporting addons to move data from excel to create slides in powerpoint.  This damaged the part of my brain that would make me feel too guilty to admit this.

The same people that used excel as their database needed to result reported in powerpoint so they could see if one bar was bigger than the other, which meant "buy more hardware".
Lance Hampton Send private email
Thursday, January 10, 2008
 
 
Even if you have date columns formatted properly, try entering a date before 1900-01-01.  You probably won't get the result you expect.
anonymous
Thursday, January 10, 2008
 
 
We deal with about 1/2 a million part numbers across our network of partners, and about 1/10 of those part numbers begin with leading zeros.

Excel can be a great tool, however, for me it is EVIL.
Mystic Llamaherder
Thursday, January 10, 2008
 
 
"* Ensure all dates are entered (preferably as strings) in DD-MMM-YYYY format to remove any ambiguity during the SQL load."

Personally, I like dates in ISO 8601 format.  Today is 2008-01-10.  So when I have a spreadsheet with dates, Excel gives me grief.

Sincerely,

Gene Wirchenko
Gene Wirchenko Send private email
Thursday, January 10, 2008
 
 
Excel does what you tell it too, but it won't read your mind. Tell it what you want.

This is what you get for using Excel as a proxy for a development environment.  You will have to "program" it.

Now leave us!  And only come back once you have masted your tool!
Lothar of The Hill People
Friday, January 11, 2008
 
 
OK, Lothar, tell me how to have cells that have date values that are entered in ISO 8601 format and displayed in that format.

Sincerely,

Gene Wirchenko
Gene Wirchenko Send private email
Friday, January 11, 2008
 
 
Hi Gene,

I’m assuming your question is serious – this is very easy to do.

1) Highlight the cells you want to format
2) From the Format menu choose Cells, the Number tab and the Custom option
3) Enter “yyyy-mm-dd” (without quotes) in the Type field and click OK.

I had no issue entering 2008-01-14. Excel recognised it. If you looking in the formula bar through, Excel still reads it as a regular date.

I’ve taken my interpretation of the ISO 8601 format from http://www.w3.org/TR/NOTE-datetime/.

Regards -
Marcus from London
Monday, January 14, 2008
 
 
For a non binary import/export format, use SYLK rather than CSV - it will preserve the leading zeros, because it encodes numeric and string fields differently.  It will also apply a format to a field if you desire, so you can have your ISO formatted dates.
Mark Ransom Send private email
Monday, January 14, 2008
 
 
==> Folks in my industry seem to think that Excel makes a dandy vehicle for passing data from desk to desk, and company to company. In fact, it sucks for that purpose.

What make you think that this is limited to your industry?

As far as I'm concerned anyone in any kind of business with a PeeCee on their desk thinks this way. It's scary. There's even IT folks who follow this line of thinking.

Be afraid.
Sgt.Sausage
Monday, January 14, 2008
 
 
Marcus from London: "Hi Gene,

I’m assuming your question is serious – this is very easy to do.

1) Highlight the cells you want to format
2) From the Format menu choose Cells, the Number tab and the Custom option
3) Enter “yyyy-mm-dd” (without quotes) in the Type field and click OK."

It was quite serious.  While the above does allow for a date to be entered and displayed in ISO 8601 format, the =bar displays the dates in this format: 14/01/2008, so if I want to edit an existing date, it is awkward.

Sincerely,

Gene Wirchenko
Gene Wirchenko Send private email
Monday, January 14, 2008
 
 
Gene,

If I go to the Control Panel/Regional Settings and set my date format to ISO, Excel(97) respects that format and displays it in the =bar. This is exactly what I would expect a well-behaved application to do.
Mark Ransom Send private email
Monday, January 14, 2008
 
 
OK.  Now, it is working.  Not very friendly.

I also see why I did not get dates right in the first place.  Silly me.  I thought that I should format a date as date (not custom).

Sincerely,

Gene Wirchenko
Gene Wirchenko Send private email
Tuesday, January 15, 2008
 
 

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics
 
Powered by FogBugz