2
votes

I've written a duplicate finder in Java, but I need to add hard link support to it. Unfortunately, there seems to be no way to get at a file's MFT entry in Java.

Although there is a fileKey() method in the BasicFileAttributes interface, it doesn't work on the NTFS file system, where it returns null (I haven't tested it on ext yet).

I also found the isSameFile() method (in java.nio.file.Files). Does anyone know how this method works? It seems to be doing the right thing, but it returns a boolean, so it is of little use to me (I want to put the results into a map and group them by their MFT entries).
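A boolean comparison can still be turned into groups, at the cost of more comparisons: compare each path against one representative of every group found so far. The following is only a minimal sketch of that idea; `SameFileGrouper` and `group` are names made up for illustration, and it assumes every path exists and is readable.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
class SameFileGrouper {
    // Groups paths so that all paths in one group refer to the same underlying file.
    static List<List<Path>> group(List<Path> paths) throws IOException {
        List<List<Path>> groups = new ArrayList<>();
        for (Path p : paths) {
            List<Path> match = null;
            // Compare against the first member of each existing group only.
            for (List<Path> g : groups) {
                if (Files.isSameFile(g.get(0), p)) {
                    match = g;
                    break;
                }
            }
            if (match == null) {
                match = new ArrayList<>();
                groups.add(match);
            }
            match.add(p);
        }
        return groups;
    }
}
```

This needs one isSameFile() call per existing group for every path, so it scales poorly compared with a real key in a map. Note also that isSameFile() follows symbolic links, so a symlink and its target would land in the same group.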

I can always compare the creation times, modification times, etc. for each file, but this is just giving up.

Is there any way to accomplish what I am trying to do in either C++ or Java? I care more about making it work on NTFS than ext.

4 Answers

1
votes

You would need to use the FILE_ID_FULL_DIR_INFORMATION structure with the NtQueryDirectoryFile function (or the FILE_INTERNAL_INFORMATION structure with NtQueryInformationFile, if you already have a handle), both exported by ntdll.dll (available since Windows XP, if not earlier), to get the 8-byte file IDs and check whether they are the same.

This will tell you if they are the same file, but not if they are the same stream of the same file.

I'm not sure how to detect if two files are the same stream from user-mode -- there is a structure named FILE_STREAM_INFORMATION which can return all the streams associated with a file, but it doesn't tell you which stream you have currently opened.

1
votes

Detecting hard links is usually accomplished by calling FindFirstFileNameW. But there is a lower level way.

To get the NTFS equivalent to inodes, try the FSCTL_GET_OBJECT_ID ioctl code.

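There's a unique (until the file is deleted) identifier in the BY_HANDLE_FILE_INFORMATION structure as well: the nFileIndexHigh and nFileIndexLow members, which together with dwVolumeSerialNumber identify the file. GetFileInformationByHandle fills in the structure.

If the volume has the USN change journal enabled, you can issue the FSCTL_READ_FILE_USN_DATA ioctl code. Check the FileReferenceNumber member in the USN_RECORD structure.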

0
votes

In Java you can use sun.nio.ch.FileKey (a JDK-internal class), which is an opaque wrapper around the NTFS equivalent of an inode. All hard links to a file share the same inode.

Therefore, if you need to collect hard links, you can create a FileKey for each candidate and compare them (e.g. by putting pairs of FileKey -> File into a Multimap).
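The grouping step might look roughly like the sketch below. To keep it self-contained it swaps in two things that are not from this answer: the public BasicFileAttributes.fileKey() as the map key instead of sun.nio.ch.FileKey, and a plain HashMap of lists instead of a Guava Multimap. `FileKeyGrouper` and `groupByKey` are made-up names. Since the question reports that fileKey() is null on NTFS, the sketch falls back to the path itself in that case, which means no hard links are detected there.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any library.
class FileKeyGrouper {
    // Maps each file key to every path that carries it.
    static Map<Object, List<Path>> groupByKey(List<Path> paths) throws IOException {
        Map<Object, List<Path>> byKey = new HashMap<>();
        for (Path p : paths) {
            Object key = Files.readAttributes(p, BasicFileAttributes.class).fileKey();
            if (key == null) {
                // Assumption: with no key available, treat the path as its own group.
                key = p.toAbsolutePath().normalize();
            }
            byKey.computeIfAbsent(key, k -> new ArrayList<>()).add(p);
        }
        return byKey;
    }
}
```

Any entry whose list holds more than one path is then a set of hard links to one file. The same map shape should work with whatever key object you end up with, as long as it implements equals() and hashCode().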

0
votes

I find that fileKey() is always null. Here is some code that can actually read the NTFS inode number. There remain many aspects I'm not happy with, not least that it relies on reflection.

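// sun.nio.ch.FileKey is JDK-internal; on Java 9+ this likely needs
// --add-exports and --add-opens java.base/sun.nio.ch=ALL-UNNAMED.
import sun.nio.ch.FileKey;
import java.io.FileDescriptor;
import java.io.FileInputStream;
import java.io.IOException;
import java.lang.reflect.Field;
import java.nio.file.Path;

class NTFS {
    // Returns the 64-bit NTFS file index of the file at the given path.
    static long inodeFromPath(Path path) throws IOException, NoSuchFieldException, IllegalAccessException {
        try (FileInputStream fi = new FileInputStream(path.toFile())) {
            FileDescriptor fd = fi.getFD();
            FileKey fk = FileKey.create(fd);

            // The Windows FileKey stores the file index as two private 32-bit halves.
            long high = readField(fk, "nFileIndexHigh");
            long low = readField(fk, "nFileIndexLow");

            // Mask each half so a sign-extended value cannot corrupt the result.
            return ((high & 0xFFFFFFFFL) << 32) | (low & 0xFFFFFFFFL);
        }
    }

    // Reads a private numeric field of FileKey via reflection.
    private static long readField(FileKey fk, String name) throws NoSuchFieldException, IllegalAccessException {
        Field field = FileKey.class.getDeclaredField(name);
        field.setAccessible(true);
        return field.getLong(fk); // works whether the field is declared int or long
    }
}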
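The last step, joining the high and low halves into one 64-bit number, can be checked on its own without touching any JDK internals. This is just a worked sketch of the arithmetic; `FileIndex` and `combine` are names invented here, and the sample values are arbitrary.

```java
// Hypothetical helper showing only the arithmetic.
class FileIndex {
    // Combines two 32-bit halves into one 64-bit value.
    // Masking the low half guards against a sign-extended (negative) input.
    static long combine(long high, long low) {
        return (high << 32) | (low & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        long id = combine(0x00050000L, 0x0001A2B3L);
        System.out.println(Long.toHexString(id)); // prints 500000001a2b3
    }
}
```

For non-negative halves this gives the same result as high * 2^32 + low. The two differ only if the low half arrives as a negative number, where plain addition would subtract from the high half instead of filling the low 32 bits.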