1
votes

The FindNextFile WinAPI function is used to list the contents of directories. Microsoft states in the documentation that the order is file system dependent. However, on NTFS the names should be in alphabetical order most of the time.

The order in which this function returns the file names is dependent on the file system type. With the NTFS file system and CDFS file systems, the names are usually returned in alphabetical order. With FAT file systems, the names are usually returned in the order the files were written to the disk, which may or may not be in alphabetical order. However, as stated previously, these behaviors are not guaranteed.

My application needs some ordering of objects in directories. Because the majority of Windows users use NTFS, I would like to optimize my application for that case. Therefore I use the function _wcsicmp to compare names. Most of the time it is correct, and results from FindNextFile are sorted according to _wcsicmp. However, sometimes the results are not sorted. I thought that was natural, because FindFirstFile doesn't guarantee the order and I must sort anyway (in case of another file system). Then I noticed a strange pattern: the character '_' seems to be returned after letters. A folder with the content (a.txt, b.txt, _.txt) is returned in the order a, b, _. The function _wcsicmp sorts that as _, a, b. Tested on Windows 8.1. I ran some tests and this behavior is consistent.

Can someone explain what comparison criteria NTFS uses? Or why FindNextFile returns names out of alphabetical order?

2
Hmm, you are comparing the order in which FindNextFile finds files with the way the files are ordered in Explorer, which does not use the disk order. It was made for normal people, who do not understand why file2.txt comes after file11.txt. - Hans Passant
@ Hans Passant: Not exactly. I'm comparing the order of FindNextFile with the order of _wcsicmp. I'm not presenting the list of files to the user, so I can pick whatever sort I prefer. I'm just curious why the order of FindNextFile differs from a basic case-insensitive string compare. '_' is U+005F and 'a' is U+0061, so it seems like '_' should go before 'a'. - koscelansky

2 Answers

6
votes

Because NTFS sort rules are not as simple as sorting in alphabetical order. Here is an MSDN blog article that sheds some light on the problem:

Why do NTFS and Explorer disagree on filename sorting?

One reason for this may be that NTFS captures the case mapping table at the time the drive is formatted and continues to use that table, even if the OS's case mapping tables change subsequently.
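The a, b, _ order from the question fits a comparison that folds names to upper case before comparing code units: '_' (U+005F) lies between 'Z' (U+005A) and 'a' (U+0061), so it sorts after letters once they are uppercased, and before them once they are lowercased, which is what _wcsicmp does. The following is only a portable sketch of that idea. It assumes NTFS compares uppercased code units, and it uses the C runtime's towupper rather than the volume's own case table. The names compare_upper, compare_lower and sort_names are hypothetical helpers, not Windows APIs.

```c
#include <stdlib.h>
#include <wchar.h>
#include <wctype.h>

/* Compare two names after folding each character to UPPER case.
   Assumption: this approximates the NTFS directory order for ASCII names. */
static int compare_upper(const void *a, const void *b)
{
    const wchar_t *s = *(const wchar_t *const *)a;
    const wchar_t *t = *(const wchar_t *const *)b;
    for (;; ++s, ++t) {
        wint_t cs = towupper((wint_t)*s);
        wint_t ct = towupper((wint_t)*t);
        if (cs != ct)
            return cs < ct ? -1 : 1;
        if (cs == 0)
            return 0;   /* both strings ended at the same position */
    }
}

/* Compare two names after folding each character to LOWER case,
   which is how _wcsicmp is documented to behave. */
static int compare_lower(const void *a, const void *b)
{
    const wchar_t *s = *(const wchar_t *const *)a;
    const wchar_t *t = *(const wchar_t *const *)b;
    for (;; ++s, ++t) {
        wint_t cs = towlower((wint_t)*s);
        wint_t ct = towlower((wint_t)*t);
        if (cs != ct)
            return cs < ct ? -1 : 1;
        if (cs == 0)
            return 0;
    }
}

/* Hypothetical helper: sort an array of names in place.
   upper != 0 selects the uppercase fold, upper == 0 the lowercase fold. */
static void sort_names(const wchar_t **names, size_t count, int upper)
{
    qsort(names, count, sizeof *names, upper ? compare_upper : compare_lower);
}
```

Sorting the three names from the question with the uppercase fold puts _.txt last, while the lowercase fold puts it first. For names outside ASCII the result may still differ from what a real volume returns, since the case table on disk is not the C runtime's.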

1
votes

You can use CompareStringEx with the SORT_DIGITSASNUMBERS flag. The minimum system requirement for this function is Windows Vista, although the SORT_DIGITSASNUMBERS flag itself is only supported on Windows 7 and later.

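int result = CompareStringEx(LOCALE_NAME_USER_DEFAULT,  /* NULL, i.e. the user's default locale */
    SORT_DIGITSASNUMBERS,  /* 0x00000008 */
    lpString1, cchCount1, lpString2, cchCount2,
    NULL, NULL, 0);  /* reserved parameters */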

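The comparison result of this function is unusual: instead of a negative, zero, or positive value, it returns 1, 2, or 3 (and 0 on failure). Subtracting 2 from a nonzero result gives the usual C runtime convention.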

#define CSTR_LESS_THAN            1           // string 1 less than string 2
#define CSTR_EQUAL                2           // string 1 equal to string 2
#define CSTR_GREATER_THAN         3           // string 1 greater than string 2
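To plug such a result into qsort or a similar comparator, it has to be mapped to the negative/zero/positive convention. A minimal sketch follows, assuming the constant values listed above. The function cstr_to_crt is a hypothetical helper, not part of the Windows API, and the #ifndef guard only exists so the snippet also compiles without the Windows headers.

```c
#include <stddef.h>

#ifndef CSTR_EQUAL
#define CSTR_LESS_THAN    1   /* string 1 less than string 2 */
#define CSTR_EQUAL        2   /* string 1 equal to string 2 */
#define CSTR_GREATER_THAN 3   /* string 1 greater than string 2 */
#endif

/* Hypothetical helper: convert a CompareStringEx-style result (1, 2, 3)
   into the C runtime convention (-1, 0, 1).
   A result of 0 means the call failed; this is reported through *failed
   (if the pointer is non-NULL) and 0 is returned. */
static int cstr_to_crt(int cstr_result, int *failed)
{
    if (cstr_result == 0) {
        if (failed != NULL)
            *failed = 1;
        return 0;
    }
    if (failed != NULL)
        *failed = 0;
    return cstr_result - CSTR_EQUAL;
}
```

Treating a failed call as "equal" is a design choice of this sketch. A caller that cares should check the failed flag rather than rely on the return value alone.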

You can also try _wcsicoll on older systems. If I recall correctly, _wcsicoll works better, but it is still not the same as Windows's sort.