
I have been working with NTFS lately, parsing the MFT to implement a quick search that finds files with specific extensions (even deleted ones) and reconstructs their paths. The first odd thing I ran into is that in all three cases I have looked at, drive C contains a lot of invalid MFT records (more than 3/4 of them). Most of them, if not all, fail signature validation.

I would normally assume that these records are simply unused, but another problem makes me think something is wrong: once all file records with the required extension are found, some of them point to parent MFT records that fail validation for the same reason. Yet those files are marked as 'in-use' and I can see them in Explorer. What is also strange is that the directory they are in must be valid, because other files in the same directory or its subdirectories point to valid directory records. For example, the file log.txt is on the Desktop and points to an invalid file record, while the folder data (also on the Desktop) contains a file info.txt, and 'data' points to a valid file record.

Signature validation (simplified):

struct FILE_RECORD_HEADER  
{
    uint32      Magic;          //Should match FILE_RECORD_SIGNATURE
    uint16      OffsetOfUS;     //Offset of Update Sequence
    uint16      SizeOfUS;       //Size in 2-byte ints of Update Sequence Number & Array
    uint64      LSN;            //$LogFile Sequence Number
    uint16      SeqNo;          //Sequence number
    uint16      Hardlinks;      //Hard link count
    uint16      OffsetOfAttr;   //Offset of the first Attribute
    uint16      Flags;          //Flags
    uint32      RealSize;       //Real size of the FILE record
    uint32      AllocSize;      //Allocated size of the FILE record
    uint64      RefToBase;      //File reference to the base FILE record. Low 6B - file reference, high 2B - MFT record sequence number
    uint16      NextAttrId;     //Next Attribute Id
    uint16      Align;          //Align to 4 uint8 boundary
    uint32      RecordNo;       //Number of this MFT Record
};

#define FILE_RECORD_SIGNATURE       'ELIF'

//rawFileRecord is a pointer to a block of memory in which a file record is stored
FILE_RECORD_HEADER * header = (FILE_RECORD_HEADER *)rawFileRecord;
if(header->Magic != FILE_RECORD_SIGNATURE)
{
    //The file record is invalid
}
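
As a side note on the check above: 'ELIF' is a multi-character constant, and its value is implementation-defined, so the comparison depends on how the compiler packs the four characters. A minimal sketch of a byte-wise alternative follows; the helper name hasFileSignature is hypothetical and not part of the code above.

```cpp
#include <cstdint>
#include <cstring>

// Returns true if the first four bytes of the raw record are 'F','I','L','E'.
// Comparing bytes avoids relying on the value of a multi-character constant.
// The caller must pass a buffer of at least four bytes.
inline bool hasFileSignature(const std::uint8_t* rawFileRecord) {
    static const char kSignature[4] = {'F', 'I', 'L', 'E'};
    return std::memcmp(rawFileRecord, kSignature, sizeof kSignature) == 0;
}
```

This only checks the signature; it says nothing about whether the rest of the record is consistent.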

Getting LCN of parent:

struct ATTR_FILE_NAME
{
    uint64  ParentRef;      //File reference to the parent directory. Low 6B - file reference, high 2B - MFT record sequence number
    uint64  CreateTime;     //File creation time
    uint64  AlterTime;      //File altered time
    uint64  MFTTime;        //MFT changed time
    uint64  ReadTime;       //File read time
    uint64  AllocSize;      //Allocated size of the file
    uint64  RealSize;       //Real size of the file
    uint32  Flags;          //Flags
    uint32  ER;             //Used by EAs and Reparse
    uint8   NameLength;     //Filename length in characters
    uint8   NameSpace;      //Filename space
    uint16  Name[1];        //Filename
};

ATTR_FILE_NAME * attr = (ATTR_FILE_NAME*)filenameAttr; //where filenameAttr is a pointer to the beginning of filename attribute somewhere in the rawFileRecord
uint64 parentLCN = attr->ParentRef & 0x0000FFFFFFFFFFFF;

Is it possible to miss files during the search because of this signature mismatch (I think so, but I want to be sure)? And why do some file records point to invalid parents while others point to valid ones, when they are supposed to have the same parent?

Stack Overflow is for programming questions. This question is off-topic. - IInspectable
@IInspectable how are the NTFS structure and its inner workings not related to programming? - 16Shadows
No, they are not. This question is about general computing software and hardware. See help center. - IInspectable
@IInspectable then why are there about 600 answered questions about NTFS on this site? Why is my question about weird behaviour encountered while writing an NTFS parser suddenly off-topic? - 16Shadows
If you have an issue with an implementation, then post a minimal reproducible example including the observed as well as the expected behavior. As written, you are just asking about technical details of a protocol. That's not a programming problem. I posted a link to the help center already. - IInspectable

1 Answer


The reference you are searching for is the RecordNo in FILE_RECORD_HEADER.

The low 6 bytes of ParentRef (the file reference) are supposed to match it.

If those two match, you have the right record.
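
A minimal sketch of that comparison, splitting a 64-bit reference into the two parts described in the struct comments of the question (low 6 bytes: record number, high 2 bytes: sequence number). The names FileRef, decodeFileRef and refersTo are hypothetical. Comparing the sequence number against SeqNo as well is an extra check assumed here, intended to catch a record that has been reused since the reference was written.

```cpp
#include <cstdint>

// A decoded NTFS file reference.
struct FileRef {
    std::uint64_t recordNo;  // low 48 bits: MFT record number
    std::uint16_t seqNo;     // high 16 bits: sequence number
};

inline FileRef decodeFileRef(std::uint64_t raw) {
    return FileRef{ raw & 0x0000FFFFFFFFFFFFULL,
                    static_cast<std::uint16_t>(raw >> 48) };
}

// True if parentRef points at the record whose header carries the given
// RecordNo and SeqNo values.
inline bool refersTo(std::uint64_t parentRef,
                     std::uint32_t recordNo,
                     std::uint16_t seqNo) {
    const FileRef ref = decodeFileRef(parentRef);
    return ref.recordNo == recordNo && ref.seqNo == seqNo;
}
```

For example, the reference 0x0003000000000005 decodes to record number 5 with sequence number 3.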

It's the main reason why an NTFS volume can only contain 4,294,967,295 files: the record number is stored in a uint32.

Personally, I found it easier to map everything using $INDEX_ROOT_ATTR and $INDEX_ALLOCATION_ATTR: they contain the same kind of reference, and they make the tree structure easier to follow, since you can start at the root directory (its record number is always 5).
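
To illustrate how record numbers tie the tree together, here is a rough sketch that goes the other way: it walks parent references upward until it reaches record 5. It assumes the records have already been parsed into an in-memory map keyed by record number; the Node type, buildPath, the depth limit and the placeholder strings are all hypothetical.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical summary of one parsed MFT record.
struct Node {
    std::uint64_t parent;  // record number taken from ParentRef (low 48 bits)
    std::string name;      // file or directory name
};

constexpr std::uint64_t kRootRecordNo = 5;  // root directory record
constexpr int kMaxDepth = 4096;             // arbitrary guard against cycles

// Builds a backslash-separated path by following parent record numbers.
// Returns a path prefixed with "<orphan>" if a parent is missing from the map.
std::string buildPath(const std::unordered_map<std::uint64_t, Node>& nodes,
                      std::uint64_t recordNo) {
    std::string path;
    for (int depth = 0; depth < kMaxDepth; ++depth) {
        if (recordNo == kRootRecordNo) return "\\" + path;
        const auto it = nodes.find(recordNo);
        if (it == nodes.end()) return "<orphan>\\" + path;
        path = path.empty() ? it->second.name : it->second.name + "\\" + path;
        recordNo = it->second.parent;
    }
    return "<too deep>\\" + path;
}
```

A file whose parent chain cannot be resolved still gets a partial path, which makes the kind of mismatch described in the question easy to spot.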