I have some code which is parsing the $MFT on an NTFS disk.

All works perfectly, except that a handful of records (roughly 10 out of 60000) return incorrect characters in the file name. See the screenshot below:

enter image description here

Note the Unicode character defined by byte '0E'. In all other applications, this is an underscore character. See below:

enter image description here

Even in the $INDEX_ROOT attribute of the containing directory, it has the correct name:

enter image description here

Am I reading the $FILE_NAME attribute wrong? Or should I ignore what's there and always use the name from the $INDEX_ROOT attribute of the directory instead? This seems a bit backwards?

Note: it isn't always '0E', and isn't always this file name, but seems to always be only one character which is wrong in each 'bad' record.

1 Answer

For anyone in the future, I stumbled across the answer while reading this link:

The fixup array starts at offset 0x30. The first two bytes (0x 8c 06) are the last two bytes in every sector of the record. The real last couple of bytes in all the sectors are stored in the fixup array that follows, namely all zeroes.

Your values will be different, but you will notice that the 'bad' file names appear whenever the $FILE_NAME attribute spans a sector boundary (as in the above screenshots from WinHex). Once the last two bytes of each sector are replaced with the corresponding bytes from the fixup array, the file names are correct.
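
For illustration, here is a minimal Python sketch of that replacement step. It is not the code from the question: the function name `apply_fixups` is hypothetical, and it assumes the commonly documented FILE record layout (offset of the fixup array as a 16-bit value at 0x04, number of 16-bit entries including the signature at 0x06) and a 512-byte stride between fixups. Check those assumptions against your own records before relying on it.

```python
import struct

SECTOR_SIZE = 512  # assumed distance between fixup positions


def apply_fixups(record: bytes, sector_size: int = SECTOR_SIZE) -> bytes:
    """Return a copy of an MFT FILE record with the fixup bytes restored."""
    if record[:4] != b"FILE":
        raise ValueError("not a FILE record")

    # Header fields (assumed layout): offset of the fixup array at 0x04,
    # entry count at 0x06. The first entry is the signature that was
    # written over the last two bytes of every sector.
    usa_offset, usa_count = struct.unpack_from("<HH", record, 4)
    signature = record[usa_offset:usa_offset + 2]

    fixed = bytearray(record)
    for i in range(1, usa_count):
        end = i * sector_size
        if end > len(fixed):
            raise ValueError("fixup array describes more sectors than the record holds")
        # Every sector must end with the signature, otherwise the record
        # is torn or corrupt and should not be trusted.
        if fixed[end - 2:end] != signature:
            raise ValueError(f"fixup signature mismatch in sector {i - 1}")
        # Entry i of the array holds the real last two bytes of sector i - 1.
        src = usa_offset + 2 * i
        fixed[end - 2:end] = record[src:src + 2]
    return bytes(fixed)
```

Run this on each raw record before walking its attributes, so that $FILE_NAME (and anything else that crosses a sector boundary) is decoded from the restored bytes.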