The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

Comparing the file paths.

Say I have a file called test.txt in C:\temp, so its absolute path is "C:\temp\test.txt".
If the temp folder is shared under the share name temp1, the network path for this file is "\\mymachine\temp1\test.txt".
Both of these absolute paths refer to the same file. Given the two paths, is there a way to programmatically infer that they point to the same file?
Can we use a Win32 function, or a combination of Win32 functions, to achieve this?

(P.S.: I opened both files, but could not find any API that uses the handle to retrieve information about the file.)

Thanks.
Kiran
Monday, October 15, 2007
 
 
One of the main security flaws that people have in programs is related to "canonicalization errors". These result from incorrectly determining whether two file paths refer to the same file.

Google for "canonical path" or similar and you will probably find your answer and some tips on what not to do.
anon
Monday, October 15, 2007
 
 
In "Writing Secure Code" (2nd edition) from Microsoft Press there is a whole chapter on canonicalization issues. One of the Win32 functions they talk about using to help prevent these issues is GetLongPathName. You might include it in your Google search for a solution.
anon
Monday, October 15, 2007
 
 
If you're on the same machine maybe you could tell (i.e. maybe GetLongPathName would help in that instance), but if you're looking at remote files (i.e. \\machine\share1\test.txt and \\machine\share2\test.txt) you're in trouble because share1 may or may not point to the same location as share2.

If the files are small you could CRC the contents.  You could also try grabbing the file times (created, last written to, etc.) and maybe the file owner as an approximation.
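
A minimal sketch of the CRC idea, in Python rather than Win32 and using the standard zlib module. The function names file_crc32 and probably_same_contents are made up for illustration. Note that a matching CRC only suggests the two paths hold the same bytes; two separate copies of a file will also match, so this is an approximation and not proof of identity.

```python
import zlib


def file_crc32(path, chunk_size=65536):
    """Return the CRC-32 of a file's contents, reading it in chunks."""
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Feed the running CRC back in so the result covers the whole file.
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def probably_same_contents(path_a, path_b):
    """True if both files have the same CRC-32 (same contents, most likely)."""
    return file_crc32(path_a) == file_crc32(path_b)
```

A mismatch is conclusive (the contents differ), while a match still leaves open the possibility of an identical copy stored elsewhere.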

I've never heard of any kind of file-system unique ID that is assigned to each file.  If you find one, please post it!
Doug
Monday, October 15, 2007
 
 
Doug,

On Linux/Unix, each file in a filesystem has a unique "inode".  Each filesystem also has a number identifying it, valid for as long as the filesystem is mounted and available.

The tuple (filesystem, inode) uniquely identifies a file, and this information is available from the stat() or fstat() system call.

The OP's problem is solved on Unix by calling stat() on both paths, and comparing the filesystem ID (st_dev) and inode number (st_ino) returned for each file.
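
The stat() comparison can be sketched in a few lines of Python, whose os.stat wraps the same system call. The helper names file_identity and same_file are hypothetical; the standard library already offers os.path.samefile, which performs this kind of comparison.

```python
import os


def file_identity(path):
    """Return the (filesystem id, inode) pair that stat() reports for path."""
    st = os.stat(path)  # follows symbolic links, like stat() in C
    return (st.st_dev, st.st_ino)


def same_file(path_a, path_b):
    """True if both paths resolve to the same file on the same filesystem."""
    return file_identity(path_a) == file_identity(path_b)
```

Because the comparison uses what the filesystem reports and not the spelling of the path, it does not depend on how either path was written.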
David Jones
Wednesday, October 17, 2007
 
 
Windows has a similar facility: open a handle to both files, then call GetFileInformationByHandle on both handles.  If dwVolumeSerialNumber, nFileIndexHigh and nFileIndexLow are all identical, they're the same file.
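
A rough sketch of the open-both-handles approach in Python, not a direct Win32 call. It rests on an assumption worth checking for your interpreter version: that CPython on Windows fills st_dev from the volume serial number and st_ino from the 64-bit file index that GetFileInformationByHandle returns. The function name same_open_file is made up for illustration.

```python
import os


def same_open_file(path_a, path_b):
    """Open both paths at once and compare the identity of the open files."""
    # Both files stay open while their identifiers are read and compared.
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        sa = os.fstat(fa.fileno())
        sb = os.fstat(fb.fileno())
        # Assumption: on Windows st_dev is the volume serial number and
        # st_ino is the file index; on Unix they are the device and inode.
        return (sa.st_dev, sa.st_ino) == (sb.st_dev, sb.st_ino)
```

For the original question this would be called as same_open_file(r"C:\temp\test.txt", r"\\mymachine\temp1\test.txt"). Whether the identifiers are reliable when one path goes through a network share depends on the server, so the result should be treated with care in that case.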
Rob
Wednesday, October 17, 2007
 
 
The warning in the documentation for the BY_HANDLE_FILE_INFORMATION structure, which is returned by GetFileInformationByHandle, gives me pause:

"Note that this value is useful only while the file is open by at least one process. If no processes have it open, the index may change the next time the file is opened."

On the other hand, you can't get this info unless you have the file open, so I guess as long as you open both files at the same time, you can tell if they're the same or not.  Unfortunately, you can't store that value (in a database for example) to see if you run into the same file later.

The other warning that is a little ominous:
"Depending on the underlying network components of the operating system and the type of server connected to, the GetFileInformationByHandle function may fail, return partial information, or full information for the given file. In general, you should not use GetFileInformationByHandle unless your application is intended to be run on a limited set of operating system configurations"

Not quite as solid as the Unix solution, but I guess it is something...
Doug
Wednesday, October 17, 2007
 
 
Vista, at least, has added a symbolic link capability. You need to think through the implications of that, as well.
frustrated
Wednesday, October 17, 2007
 
 

This topic is archived. No further replies will be accepted.
