The Joel on Software Discussion Group (CLOSED)

A place to discuss Joel on Software. Now closed.


Your hosts:
Albert D. Kallal
Li-Fan Chen
Stephen Jones

Is it safer not to completely remove files from hard disk?

Hello,

I've learnt today that Windows doesn't "delete" files. It only puts them into an unused writeable area. Well, I just wonder why? Isn't it insecure? I've scanned a confidential document today. It was funny, we were talking about Mission Impossible and how documents were destroyed. But two co-workers stated that they can recover the file I've destroyed. Well, this was for fun, but not so funny.

What do you think about this?
Tommy
Wednesday, February 15, 2006
 
 
Data recovery isn't hard at some levels, but can become very expensive very quickly at others. If you need secure deletion, just search for it and you'll find lots. You can find a free deletion tool at AnalogX.
Ryan Smyth
Wednesday, February 15, 2006
 
 
From what I understand there are three different levels of "delete":

Level 1: Windows places the file in the Recycle Bin (if active). The file is recoverable from the bin, but not available in its former location. You can tell Windows to not use the bin, or you can delete files from the bin.

Level 2: "Permanently" delete. You can "permanently" delete a Windows file by either deleting it from the Recycle Bin or holding down <SHIFT> while deleting. This delete doesn't actually erase the contents on disk but makes the file invisible to Windows (I'm not sure how this is done; either modifying the FAT/MFT or overwriting the first byte of the file or something). You can undelete these files with some 3rd party tools.

Level 3: Destroy a file. You need a 3rd party utility for this. These tools overwrite the file with data so the contents of the file are destroyed. I believe that if you defrag your drive and consolidate free space, then "permanently" deleted files will also be destroyed since their contents are overwritten by another file. There may be ways to get this data back as well, but you have to use a service that specializes in data recovery.
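
For what it's worth, here is a minimal sketch of what a Level 3 tool does in principle: overwrite the file's bytes in place, flush them to the device, then remove the name. The function names `overwrite_with_zeros` and `destroy` are made up for illustration. It also assumes the filesystem really does write over the same blocks, which is not guaranteed on journaling or copy-on-write filesystems, or on flash media.

```python
import os


def overwrite_with_zeros(path, chunk_size=64 * 1024):
    """Overwrite every byte of an existing file with zeros, in place."""
    remaining = os.path.getsize(path)
    # "r+b" opens for update without truncating, so the writes go over
    # the existing contents rather than into a newly created file.
    with open(path, "r+b") as f:
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(b"\x00" * n)
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the zeros to the device


def destroy(path):
    """Overwrite the contents first, then remove the directory entry."""
    overwrite_with_zeros(path)
    os.remove(path)
```

A single pass of zeros is the simplest choice here; real tools usually let you pick the number of passes and the pattern.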
Former COBOL Programmer
Wednesday, February 15, 2006
 
 
I'm not aware of any operating systems that, by default, overwrite a file when you delete it. All simply mark the blocks that the file occupies as unused. As a result, the data is still there until the blocks are used for something else.

This is a good thing in most cases. Overwriting the blocks that a file occupied is expensive, and not needed 99% of the time. If it's a problem for you some of the time, "secure deletion" tools are available.
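
To make the "mark the blocks as unused" point concrete, here is a toy model, not any real filesystem. The class `ToyDisk` and its fields are invented for illustration. Deleting only touches the directory and the free list; the block contents stay put until something else happens to be written there.

```python
class ToyDisk:
    """A pretend disk: raw blocks, a free list, and a directory."""

    def __init__(self, n_blocks):
        self.blocks = [b""] * n_blocks     # raw block contents
        self.free = set(range(n_blocks))   # block numbers available for use
        self.directory = {}                # file name -> list of block numbers

    def write_file(self, name, chunks):
        used = []
        for chunk in chunks:
            block = min(self.free)         # take the lowest-numbered free block
            self.free.remove(block)
            self.blocks[block] = chunk
            used.append(block)
        self.directory[name] = used

    def delete_file(self, name):
        # Only metadata changes: the entry goes away and the blocks are
        # returned to the free list. Nothing in self.blocks is touched.
        self.free.update(self.directory.pop(name))


disk = ToyDisk(4)
disk.write_file("report", [b"secret", b"data"])
disk.delete_file("report")
leftover = list(disk.blocks)               # old contents are still readable
disk.write_file("other", [b"new"])         # reuses block 0 only
```

After the second write, block 0 holds the new file, but block 1 still holds the tail of the "deleted" one.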
clcr
Wednesday, February 15, 2006
 
 
The levels are correct. However, even after overwriting, it might be possible to recover the data, because in the end the data on the disc is analog. So if you have the bits 0, 1, 1, 0 on disk, then overwrite them with 0, 1, 0, 1, the result actually is 0, 1.1, 0.3, 0.9. The usual controller interprets this as the correct 0, 1, 0, 1, but more sophisticated technology can pick up that ghost image. That's why third-party tools usually have the option to overwrite data multiple times, and why government agencies are required to physically destroy hard disks before selling a used computer.
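
A toy model of the ghost image, for anyone who wants to play with the idea. This is only an illustration: the `RESIDUE` constant and the linear formula are assumptions invented for the sketch, not a description of how real drive media behave.

```python
RESIDUE = 0.1  # hypothetical: how strongly the old bit leaks into the new level


def overwrite(old_bits, new_bits, residue=RESIDUE):
    """Return the analog level left behind after writing new_bits over old_bits."""
    # An old 1 nudges the level slightly up, an old 0 slightly down.
    return [new + residue * (old - 0.5) for old, new in zip(old_bits, new_bits)]


def read_normally(levels):
    """What an ordinary controller sees: a simple threshold at 0.5."""
    return [1 if level > 0.5 else 0 for level in levels]


def read_ghost(levels):
    """What a precise instrument could infer from the leftover offset."""
    current = read_normally(levels)
    return [1 if level - bit > 0 else 0 for level, bit in zip(levels, current)]


levels = overwrite([0, 1, 1, 0], [0, 1, 0, 1])
```

In this model `read_normally(levels)` gives back the new bits, while `read_ghost(levels)` gives back the old ones. Each extra overwrite pass would bury the old offset deeper.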
Matthias Winkelmann
Wednesday, February 15, 2006
 
 
I never knew that.

What about those USB sticks, are they analog at the bit level too?
Larry Lard
Wednesday, February 15, 2006
 
 
Standard deletion of files (not the recycle bin stuff) in Windows is done by removing the meta-data stored about the file from the file system.
Depends on which filesystem you're using.
If it's FAT32 it's similar to what happens in *nix.
I'm not sure about NTFS though.
But the concept is similar - data stored about the file in the file system (which is again a set of records on disk) is to be deleted. So the file is actually deleted from the file system and not from the physical disk. The removal of actual data is done in an optimistic manner - some other file's data would be written over later on. However, there is no guarantee that the data would be overwritten (which is why some 3rd party programs forcefully write 0s into the space not consumed by the filesystem).

The Windows registry is to be ignored in the conversation for the sake of clarity (please don't cloud the picture by including that as well).
And do not ignore third party utilities like Norton UnErase Wizard aka Norton Protected Recycle Bin which puts the deleted files in a separate folder (you have to periodically clear the trash to actually remove it from the filesystem records).
Vineet Reynolds
Wednesday, February 15, 2006
 
 
I would imagine that everything is analog at the bit level.
Kero
Wednesday, February 15, 2006
 
 
"I would imagine that everything is analog at the bit level."

Yep, which is why the DoD method of cleansing a hard disk thoroughly of confidential data is to force-write 0s into every unused cluster, 3 times in a row.
Vineet Reynolds
Wednesday, February 15, 2006
 
 
"What about those USB sticks, are they analog at the bit level too?"

yup, the whole world is analog.

Wednesday, February 15, 2006
 
 
To Matthias Winkelmann:

That's the best description I've ever read... Bravo...

I'm an old EE engineer and I'm familiar with the concept as well.  You describe it very well.

In electromagnetics there is something called hysteresis, which is the cause of the ghost image as I understand it.

I used to design antenna systems -- in antennas you have electrical wavelength and physical wavelength.  The idea is sort of the same for the ghost image.
Eric (another ISV guy with his company)
Wednesday, February 15, 2006
 
 
" are they analog at the bit level too"

*Everything* physical is analog at the bit level. Ones and zeroes are just a convenient approximation. Kinda reminds me of Joel's Law of Leaky Abstractions.

(Old analog EEs never die; they just drift out of spec.)
lw
Wednesday, February 15, 2006
 
 
Everything presented digital is ultimately analog at the lowest level.
Nature itself is analog. Even you're analog !!

The digitisation of data is merely an approximation of the analog level (of voltage, magnetic flux, etc).
Say, we store a series of bits - 0000.
It doesn't matter if the magnetic fields (or flux density) for the locations are 4.1, 5.1, 4.5, 2.1 units.
What matters is whether the machine (ultimately it is an analog-to-digital converter) can distinguish between data that manifests itself at 4.9 (or 4.5, 4.6, 5.1) units and data at -5 (or -5.1, -2.3, -11.0) units.
Of course there will be a band in between the two digital states; any analog level in that band would not produce meaningful digital output.
And as the number of digital states increases, so does the system's inability to distinguish between the states (and hence the inaccuracy).
Vineet Reynolds
Wednesday, February 15, 2006
 
 
Data is stored/read at the bit level as a voltage difference. Simplistically this might use 0v for a 0 and 5v for a 1. In the real world, nothing will be exactly 0v or exactly 5v, so the hardware will take anything <1v as a 0 and anything >4v and <6v as a 1. Anything outside these ranges will be flagged as an error.
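
In code, that simplistic scheme looks something like the sketch below. The function name `decode_level` is made up, and the thresholds are just the illustrative figures from the paragraph above, not those of any real hardware.

```python
def decode_level(volts):
    """Map an analog voltage to a bit using the simplistic ranges above."""
    if volts < 1.0:
        return 0
    if 4.0 < volts < 6.0:
        return 1
    # Anything in the dead band between the ranges, or above 6v,
    # cannot be trusted as either value.
    raise ValueError(f"{volts}v is outside both valid ranges")
```

So 0.3v and 4.9v decode cleanly, while 2.5v is rejected rather than guessed at.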
Adrian
Wednesday, February 15, 2006
 
 
"What about those USB sticks, are they analog at the bit level too?"

"yup, the whole world is analog."

But I imagine that reading analog levels of flash is more difficult than reading analog magnetometer readings off of disc media.

However, USB sticks are even more interesting because of the built-in wear-leveling and defect tracking.  Even if you 'destroy' the file using third-party tools, the controller in the stick may have shuffled around the blocks on you, and pieces of your data are still left in blocks that the controller has marked unusable.
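
Here is a toy sketch of why overwriting doesn't help much on a wear-leveled device. `ToyFlash` is invented for illustration and is far simpler than a real controller, which would eventually garbage-collect stale pages. The point is only that a logical "overwrite" lands on a different physical page.

```python
class ToyFlash:
    """A pretend flash stick with a logical-to-physical page mapping."""

    def __init__(self, n_physical):
        self.physical = [None] * n_physical  # what is actually on the chips
        self.mapping = {}                    # logical page -> physical page
        self.next_free = 0

    def write(self, logical, data):
        # Never rewrite in place: program a fresh page and remap to it.
        # The previous physical page keeps its old contents.
        self.physical[self.next_free] = data
        self.mapping[logical] = self.next_free
        self.next_free += 1

    def read(self, logical):
        return self.physical[self.mapping[logical]]


flash = ToyFlash(4)
flash.write(0, b"secret")
flash.write(0, b"\x00" * 6)   # the "secure erase" from the host's point of view
```

Reading logical page 0 now returns zeros, yet `b"secret"` is still sitting in `flash.physical`.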

For what it is worth, I believe that the DOD melts storage media into slag.  You've got to be pretty darn paranoid to do that.  Recovery becomes more expensive at higher levels.  You have to determine for yourself how expensive you want it to be.

But at the very least, people should be aware that 'deleted' doesn't mean gone.  formatted doesn't mean gone. fdisk'd doesn't mean gone. 

Linux: "dd if=/dev/zero of=/dev/hda" probably means gone for all reasonable purposes.  You could still look at the disks in analog, and find the data, but it would be unreasonably expensive for all but the scariest three-letter-agencies.
Michael Dwyer
Wednesday, February 15, 2006
 
 
Here's a freeware file eraser called Eraser.

http://www.tolvanen.com/eraser/

You can set it to overwrite 1, 3 or 7 times.
SumoRunner
Wednesday, February 15, 2006
 
 
==>Yep, which is why the DoD method of cleansing a hard disk thoroughly of confidential data is to force-write 0s into every unused cluster, 3 times in a row.

When I was in the Army, the DOD method of "cleansing" a hard drive was a Thermite grenade. Pull the pin on that puppy and sit it on top of the case. Burns through the case, the hard-drive, the platters, the floor, falls through to the basement floor and burns half-way to China before it's done leaving your documents in a smoldering pile of ash and molten parts. Don't need any of that silly "3 times in a row" crap with that method -- once is sufficient <grin>.

http://en.wikipedia.org/wiki/Thermite
Sgt.Sausage
Wednesday, February 15, 2006
 
 
I presume it was the grenade sitting on top of the case and not you...
Simon Lucy
Wednesday, February 15, 2006
 
 
I know OS X has a "secure delete" option which I believe overwrites the deleted sectors on the HD with 0's instead of just deleting the file pointer.
Larry
Wednesday, February 15, 2006
 
 
I really wonder how much data you could get off a modern hard drive when the data has been overwritten even once. In the old days, though still a pain, it had to be comparatively easy to read off the data because it was so simply written. Now, there's an enormous amount of ECC for any given track because hard drives are writing so dense that it's virtually guaranteed some of the bits will come back bad. If that data gets overwritten, the chance your James Bond-level researcher will get the original data has to be extraordinarily small.
Loaf of Paint
Wednesday, February 15, 2006
 
 
"I'm not aware of any operating systems that, by default, overwrite a file when you delete it."

HP's venerable OpenVMS can easily do it with the DELETE/ERASE command (not by default, however, unless you define that as the default).
Deprecated
Wednesday, February 15, 2006
 
 
MBJ
Wednesday, February 15, 2006
 
 
Why would Windows be so insecure that it doesn't completely destroy all traces of a file the moment the user hits the delete button?  Obviously it's because Microsoft is evil and incompetent.  It couldn't possibly be because for the vast majority of users the vast majority of the time, having the files be recoverable is actually a good thing, could it?  Nah, never!


Incidentally, if the objective is to fully corrupt the 'ghost' image so that it's unrecoverable, why wouldn't writing random data (instead of the ever-popular all '0' bytes) be more effective? If you write a 0 several times the original bit pattern may be reduced from 1001 to 0.00001 0.000001 0.000001 0.00001, which is very subtle, but if each bit has an unequal and genuinely random number of 1's and 0's written over it then you're going to have a lot more trouble figuring out the original pattern.  (Generating perfect random numbers is a different question, of course, but if it can be done well enough for encryption, why can't it be done for destroying data?)

Wednesday, February 15, 2006
 
 
"Why would Windows be so insecure that it doesn't completely destroy all traces of a file the moment the user hits the delete button?"

I don't think it is Windows so much as just the Computer Science mindset in general.  We believe that if you lose a pointer, you've lost the data.  In the same way, if you no longer have a directory entry and you no longer have a file handle, then the data is lost. 
In reality, neither of these is actually true.  The data is still there, you just don't know where.

At least Unix is somewhat honest about it.  The system call to delete a file is called unlink() -- it unlinks a file from the directory, and does no erasing whatsoever.
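
A small sketch of that: Python's `os.unlink` wraps the same system call, so you can watch a name disappear while the data stays reachable through a second hard link. The helper `unlink_demo` and the file names are invented for illustration, and it assumes the temp directory is on a filesystem that supports hard links.

```python
import os
import tempfile


def unlink_demo():
    """Remove one name for a file and show the data survives under another."""
    d = tempfile.mkdtemp()
    first = os.path.join(d, "report.txt")
    second = os.path.join(d, "alias.txt")
    with open(first, "wb") as f:
        f.write(b"confidential")
    os.link(first, second)      # a second directory entry for the same data
    os.unlink(first)            # removes only the first name
    still_listed = os.path.exists(first)
    with open(second, "rb") as f:
        survived = f.read()
    os.unlink(second)           # last name gone; blocks are freed, not erased
    os.rmdir(d)
    return still_listed, survived
```

Even after the last link is removed, the blocks are merely handed back to the free list.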

"Incidentally, if the objective is to fully corrupt the 'ghost' image so that it's unrecoverable, why wouldn't writing random data (instead of the ever-popular all '0' bytes) be more effective?"

I think that the spec actually says "write zeros, write random, write zeros", where random is sufficiently random.  In practice, I usually just write zeros and be done with it.

Most people don't even do that.  Hey, Intel!  When you sold your servers on eBay, you didn't wipe the drives sufficiently! You also didn't leave anything really interesting on them, either.  Bummer.  On the other hand, the machines from JPL had neither hard drives nor memory in them.
Michael Dwyer
Wednesday, February 15, 2006
 
 
Far too much information can be found in this paper:
  http://wipe.sourceforge.net/secure_del.html
Michael Dwyer
Wednesday, February 15, 2006
 
 
I'm surprised no one pointed out the possibility that the scanner driver may use temporary files and that the document data may actually exist in several locations on your hard drive.

In fact, if you have two disk drives -- one for data and another for the "system" files -- you may have lingering confidential document data scattered across multiple physical media.

Kinda makes you think...
Caffeinated
Wednesday, February 15, 2006
 
 
Lots of good info above.  I would add +1 for Eraser. I like that it adds "erase recycle bin" to the recycle bin options.  So when you really really want it to go away it will.  As for whether it is a "good idea" not to really delete files: it is probably more customer service than anything else.  Consider if you loaded the most recent version of DOOMwarcraftOfTheUnrealWorld, 2.6 gigs of gaming fun, and then you remove it.  Do you really want to wait while it DOD clears 2.6 gigs?

In addition, if you are worried about files you are deleting, why not worry about the ones you are not?  For that I use TrueCrypt http://www.truecrypt.org/  _and_ I turn on the drive encryption on my machine.  This means I need to enter a BIOS password and god help me if I forget it.  I have been told there is no way to recover the data without the two passwords - but given enough time and energy anyone could eventually brute force them.  However with a nice strong 22 byte password, I am not worried. (I have customer information on my machine, but not launch codes)
MSHack
Wednesday, February 15, 2006
 
 
MSHack's idea of using encrypted filesystems is probably the best thing if you really are paranoid enough to worry about where your files are going.  If you are using Eraser or other similar tool and don't fully understand the problem, then you're probably just wasting your time and effort. 

Consider this: Any time you handle a file in plain text, it is probably stored in memory somewhere.  If that memory gets swapped out, then it is recorded in pagefile.sys in plaintext.  Try this experiment:  search c:\pagefile.sys for one of your passwords.  Linux users, try the same with /proc/kcore.  I bet you're going to find sensitive info in plain text.  So, are you going to secure delete that, too? That ought to be entertaining to watch!  Did you also remember that your laptop has a hibernate partition that is probably holding a weeks-old snapshot of your system memory?
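
If you want to try the experiment, here is a rough sketch of scanning a large binary file for a byte string without loading it all into memory. The function name `file_contains` is made up. Note that a running Windows system normally keeps pagefile.sys locked, so in practice you would need elevated access or an offline copy; that part is left as an assumption.

```python
def file_contains(path, needle, chunk_size=1 << 20):
    """Return True if the bytes `needle` occur anywhere in the file at `path`."""
    if not needle:
        raise ValueError("needle must not be empty")
    overlap = len(needle) - 1   # bytes to carry over between chunks
    tail = b""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return False
            window = tail + chunk
            if needle in window:
                return True
            # Keep the end of this window so a match that straddles
            # a chunk boundary is still found on the next read.
            tail = window[-overlap:] if overlap else b""
```

Something like `file_contains(r"c:\pagefile.sys", b"hunter2")` would be the idea, with the password encoded however the application happened to hold it in memory.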

Personally, I think secure erasure of files is a waste of your time.  This is a war you cannot win.  Secure erasure of media, however, is something that everyone should look into -- especially if they are holding sensitive data and ESPECIALLY if your hard drives are going to make it back into the market somehow.
Michael Dwyer
Wednesday, February 15, 2006
 
 
> I don't think it is Windows so much as just the Computer Science mindset in general.

Actually, it's the panicked "I accidentally deleted a report that's due tomorrow, please help me get it back!!!" support calls that are the main motivation for the 'recycle bin' trick, not some newbie in the computer science department who thinks that clearing a pointer means that the data pointed to no longer exists.

If anything, UNIX is more likely to (eventually) follow windows, and make unlink() actually mean unlink_then_relink_in_recycle_bin_directory().


> I think that the spec actually says "write zeros, write random, write zeros", where random is sufficiently random.  In practice, I usually just write zeros and be done with it.

In other words, many people are still over-estimating their security?  :)


> Hey, Intel!  [...] You also didn't leave anything really interesting on them, either. 

And other people are over-estimating the need for that security.  ;)  If there's no interesting data, then formatting the drive may well be more than sufficient.

Wednesday, February 15, 2006
 
 
Lots of OSs have had things like a file attribute that specified some degree of "scrubbing" upon file deletion.  Lots of scrubber utilities have been around for lots of years as well.

There is nothing wrong and certainly nothing unusual about simply putting freed disk sectors back onto the "available" list by default when a file is deleted.  I'm surprised nobody has said much about scrubbing freed memory.

Get a life people, nothing to see here.
Jon Dinlea
Wednesday, February 15, 2006
 
 
> Get a life people, nothing to see here.

I dunno, I'd like to see the thermite grenade thing :-)
Tim Evans
Wednesday, February 15, 2006
 
 
I agree Tim, but from a safe distance ;)
Former COBOL Programmer
Wednesday, February 15, 2006
 
 
PC File Inspector is a great (free) program for recovering data that's been deleted. I guarantee you'll be shocked at the amount of data that's still lying around on your hard drive.

http://www.pcinspector.de/file_recovery/uk/welcome.htm

Steve Gibson has a neat little program called Spinrite that's basically a glorified scandisk. It will read and attempt to re-read bits on your hard drive until it can reconstruct them. It will also refresh your hard drive by re-writing all the data just-in-case, and finding any defects before they lead to bad data in the process.

I bring this up not to pimp his product (for all I know, it could be snake oil, though it seems legit), but if you've ever seen it in action, there's a screen that displays what's going on in the program's buffer at all times, and it's just freaky watching the data go by. You expect the graph to look like a square wave [ http://images.google.com/images?q=square%20wave ], but it's pretty damn jagged with lots of stuff in a really iffy area towards the middle.

Whoever said total encryption was the only way to go is right. The reason Windows doesn't have a "true delete" function is because it doesn't exist. It's an analog world, and as such, there's no such thing as 100% in anything, especially in getting rid of things that have already happened.
MarkTAW
Thursday, February 16, 2006
 
 
A video of acid bath and thermite disc destruction methods :) http://video.google.com/videoplay?docid=-4147847319296070400
Artur Sowinski
Monday, February 20, 2006
 
 

This topic is archived. No further replies will be accepted.
