I have been experimenting with NTFS lately, trying to implement a quick search by parsing the MFT. The search is supposed to find files with specific extensions (even deleted ones) and reconstruct their paths. The first odd thing I ran into is that on all three systems I have looked at, drive C: contains a lot of invalid MFT records (more than 3/4 of them). Most of them, if not all, fail signature validation.
Normally I would assume that these records are simply unused, but another problem makes me think something is wrong: once all file records with the required extension have been found, some of them point to parent MFT records that fail validation for the same reason. Yet the files are marked as 'in-use' and I can see them in Explorer. What is even stranger is that the directory they are in is clearly valid, because other files in the same directory or its subdirectories point to valid directory records. For example, I have a file log.txt on the Desktop whose parent reference points to an invalid file record. There is also a folder data, also on the Desktop, which contains a file info.txt, and 'data' points to a valid file record.
Signature validation (simplified):
struct FILE_RECORD_HEADER
{
    uint32 Magic;        //Should match FILE_RECORD_SIGNATURE
    uint16 OffsetOfUS;   //Offset of the Update Sequence
    uint16 SizeOfUS;     //Size, in 2-byte words, of the Update Sequence Number & Array
    uint64 LSN;          //$LogFile Sequence Number
    uint16 SeqNo;        //Sequence number
    uint16 Hardlinks;    //Hard link count
    uint16 OffsetOfAttr; //Offset of the first attribute
    uint16 Flags;        //Flags
    uint32 RealSize;     //Real size of the FILE record
    uint32 AllocSize;    //Allocated size of the FILE record
    uint64 RefToBase;    //File reference to the base FILE record. Low 6 bytes: file reference, high 2 bytes: MFT record sequence number
    uint16 NextAttrId;   //Next attribute id
    uint16 Align;        //Padding to a 4-byte boundary
    uint32 RecordNo;     //Number of this MFT record
};
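#define FILE_RECORD_SIGNATURE 'ELIF' //The bytes "FILE" read as a little-endian uint32 (multi-character constant, so the value is implementation-defined)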
//rawFileRecord is a pointer to a block of memory in which a file record is stored
FILE_RECORD_HEADER * header = (FILE_RECORD_HEADER *)rawFileRecord;
if(header->Magic != FILE_RECORD_SIGNATURE)
{
    //The file record is treated as invalid
}
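To narrow down what the failing records actually contain, it may help to classify the first four bytes instead of only comparing them against one value. The following is a minimal sketch, not part of my original code: `RecordKind` and `classifyRecord` are hypothetical names, and the interpretation of the non-FILE cases is an assumption (as far as I know, "BAAD" is written to records that NTFS has flagged as damaged, and all-zero bytes usually indicate a slot that was never initialized).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical classification of the first four bytes of an MFT record slot.
enum class RecordKind { File, Baad, Zeroed, Unknown };

// Compares the raw on-disk bytes directly, so the result does not depend on
// how the compiler encodes a multi-character constant such as 'ELIF'.
RecordKind classifyRecord(const std::uint8_t* raw, std::size_t size)
{
    assert(raw != nullptr);
    if (size < 4)
        return RecordKind::Unknown;              // too short to hold a signature
    if (std::memcmp(raw, "FILE", 4) == 0)
        return RecordKind::File;                 // normal file record
    if (std::memcmp(raw, "BAAD", 4) == 0)
        return RecordKind::Baad;                 // assumption: record flagged as damaged
    static const std::uint8_t zeros[4] = {0, 0, 0, 0};
    if (std::memcmp(raw, zeros, 4) == 0)
        return RecordKind::Zeroed;               // assumption: never-initialized slot
    return RecordKind::Unknown;                  // something else, e.g. data read from the wrong offset
}
```

Counting how many records fall into each category might show whether the "invalid" records are merely empty slots or whether the reader is landing on data that is not part of the MFT at all.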
Getting LCN of parent:
struct ATTR_FILE_NAME
{
    uint64 ParentRef;  //File reference to the parent directory. Low 6 bytes: file reference, high 2 bytes: MFT record sequence number
    uint64 CreateTime; //File creation time
    uint64 AlterTime;  //File altered time
    uint64 MFTTime;    //MFT changed time
    uint64 ReadTime;   //File read time
    uint64 AllocSize;  //Allocated size of the file
    uint64 RealSize;   //Real size of the file
    uint32 Flags;      //Flags
    uint32 ER;         //Used by EAs and Reparse
    uint8 NameLength;  //Filename length in characters
    uint8 NameSpace;   //Filename namespace
    uint16 Name[1];    //Filename
};
//filenameAttr is a pointer to the beginning of the filename attribute somewhere in rawFileRecord
ATTR_FILE_NAME * attr = (ATTR_FILE_NAME*)filenameAttr;
uint64 parentLCN = attr->ParentRef & 0x0000FFFFFFFFFFFF;
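For reference, here is a small sketch of how the 64-bit parent reference could be split into its two parts, following the layout given in the struct comment (low 6 bytes, high 2 bytes). `FileReference`, `decodeFileReference` and `recordOffsetInMft` are hypothetical names introduced for illustration. The sketch also assumes that the low 48 bits are an index of a record inside the $MFT data stream rather than a cluster number, and that the record size (often 1024 bytes) is read from the boot sector; both assumptions should be checked against the volume being parsed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical decoded form of an NTFS file reference.
struct FileReference
{
    std::uint64_t recordNumber;   // low 48 bits of the reference
    std::uint16_t sequenceNumber; // high 16 bits of the reference
};

FileReference decodeFileReference(std::uint64_t ref)
{
    FileReference r;
    r.recordNumber   = ref & 0x0000FFFFFFFFFFFFULL;
    r.sequenceNumber = static_cast<std::uint16_t>(ref >> 48);
    return r;
}

// Byte offset of a record relative to the start of the $MFT data stream.
// Assumption: this is an offset inside $MFT, not an offset on the volume;
// if $MFT is fragmented it still has to be mapped through $MFT's data runs.
std::uint64_t recordOffsetInMft(std::uint64_t recordNumber, std::uint32_t recordSize)
{
    assert(recordSize != 0);
    return recordNumber * recordSize;
}
```

Comparing `sequenceNumber` with the `SeqNo` field in the header of the record that was actually read would be one way to check whether the lookup landed on the intended parent.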
Is it possible to miss files during the search because of this signature mismatch? (I think so, but I want to be sure.) And why do some file records point to invalid parents while others point to valid ones, even though they are supposed to have the same parent?