The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

Software for Planetary Probes

I'm fascinated by the role that software success and failure plays in the success and failure of space missions.

The failed Mars Polar Lander, for example, is thought to have prematurely shut down its engines because of a software design that was not well thought out:

"The underlying cause [of its failure was] inadequate software design and systems test."

There are lots of other stories of how software brought down space missions.

But there are not so many articles about the architecture of systems that actually worked! That is the interesting part.

The Mars Rovers right now are running on *off the shelf* microprocessors. Instead of exotic hardware redundancy, the systems are designed to be able to recover from catastrophic failure. If bad stuff is detected, the rover stops, shuts down, and pings Earth, awaiting further instructions or a new operating system. This involves being able to tell when an error condition has occurred, and being able to get to this recovery mode no matter what else has happened. It is an interesting design, and one that has resulted in Rovers still going strong more than a year after landing even though they were designed to last only 3 months. Who knows, maybe they will last as long as the Viking orbiter did. Designed for a 120 day lifespan, it worked for 8 years.

I think that with space hardware, getting really good, dedicated people to work on the software is crucial. People who can account for every possibility and who can set up systems to recover sanely from the possibilities they didn't think of.
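
To make the "stop, shut down, ping Earth, wait" idea concrete, here is a minimal sketch of a supervisor with a safe mode. It is only an illustration of the general pattern described above, not the Rovers' actual flight software; the names `Supervisor`, `Mode`, `step`, and `command_from_ground` are all hypothetical.

```python
from enum import Enum, auto


class Mode(Enum):
    NOMINAL = auto()
    SAFE = auto()


class Supervisor:
    """Toy fault supervisor (hypothetical): any detected fault or unexpected
    exception drops the system into SAFE mode, where it only beacons."""

    def __init__(self):
        self.mode = Mode.NOMINAL
        self.log = []

    def step(self, task, health_ok):
        """Run one control cycle.

        `task` is a callable doing normal work; `health_ok` is a callable
        that returns False when an error condition is detected.
        """
        if self.mode is Mode.SAFE:
            # In safe mode, do nothing except announce the state and wait.
            self.log.append("beacon: in safe mode, awaiting instructions")
            return
        try:
            if not health_ok():
                raise RuntimeError("health check failed")
            task()
        except Exception as exc:
            # Catch-all on purpose: the recovery path must be reachable
            # from any failure, including ones nobody anticipated.
            self.log.append(f"fault: {exc}")
            self.mode = Mode.SAFE

    def command_from_ground(self, command):
        """Only an explicit ground command leaves SAFE mode."""
        if self.mode is Mode.SAFE and command == "resume":
            self.log.append("ground: resume")
            self.mode = Mode.NOMINAL
```

The design choice worth noticing is that the only way out of safe mode is an explicit command, so a fault that nobody predicted still ends in a known, quiet state rather than in further autonomous action.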
Saturday, February 26, 2005
True.  In fact, the Rovers are running VxWorks (for the PowerPC, I believe).  And they are a success story.  Jack Ganssle has the full story, if you want to read it, on his website.

Well, I've googled for a better reference, and can't find one.  Basically, they had a problem with full memory and interrupt priority inversion.  They changed this, on Mars, from Earth, uploaded new firmware, and the problem was fixed.
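
For anyone unfamiliar with priority inversion, the toy simulation below may help. It is a sketch under stated assumptions: three made-up tasks on one CPU sharing one mutex, with invented timings, and it is not VxWorks code. Priority inheritance is shown as one standard remedy; the thread does not say whether that is what the uploaded fix actually did.

```python
def simulate(inherit):
    """Simulate three hypothetical tasks on one CPU.

    Returns {task name: tick at which it completed}. Higher "prio" wins.
    Script steps: "L"/"U" lock/unlock the shared mutex (zero time),
    "W" is one tick of CPU work.
    """
    tasks = {
        "low":  {"prio": 1, "release": 0, "script": list("LWWWU")},
        "high": {"prio": 3, "release": 1, "script": list("LWU")},
        "med":  {"prio": 2, "release": 2, "script": list("WWWWW")},
    }
    owner = None      # task currently holding the mutex
    waiting = set()   # tasks blocked on the mutex
    done = {}

    def effective_prio(name):
        prio = tasks[name]["prio"]
        if inherit and name == owner:
            # Priority inheritance: the holder borrows its waiters' priority.
            prio = max([prio] + [tasks[w]["prio"] for w in waiting])
        return prio

    tick = 0
    while len(done) < len(tasks):
        ready = [n for n, t in tasks.items()
                 if t["release"] <= tick and n not in done and n not in waiting]
        if not ready:
            tick += 1
            continue
        name = max(ready, key=effective_prio)
        script = tasks[name]["script"]
        if script[0] == "L" and owner is not None:
            waiting.add(name)      # blocked; pick another task this tick
            continue
        if script[0] == "L":
            owner = name           # acquire the free mutex (zero time)
            script.pop(0)
        script.pop(0)              # one tick of "W" work
        tick += 1
        if script and script[0] == "U":
            script.pop(0)          # release at the end of the work tick
            owner = None
            waiting.clear()        # blocked tasks may retry the lock
        if not script:
            done[name] = tick
    return done


print(simulate(inherit=False)["high"])  # high task is stuck behind "med"
print(simulate(inherit=True)["high"])   # holder is boosted, high runs sooner
```

Without inheritance, the medium-priority task starves the low-priority mutex holder, so the high-priority task waits on both. With inheritance, the holder runs at the waiter's priority until it releases the mutex.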
Saturday, February 26, 2005
What you say is true; I worked on projects like this a few years ago. But the overriding factor is the *huge* amount of money spent in every phase of the development cycle. Cost overruns of hundreds of millions of dollars were not unusual. You don't want to know what the original budget was (your tax dollars at work...)
Anony Coward
Sunday, February 27, 2005
You might find this book interesting:

"Fatal Defect" by Ivars Peterson
Sunday, February 27, 2005
Read some of Jack Ganssle's stuff.  A lot of his articles chronicle high-profile software failures.
Myron A. Semack
Monday, February 28, 2005

This topic is archived. No further replies will be accepted.
