1
votes

My C# console application works perfectly for filenames that contain only ASCII characters, but when a filename contains non-ASCII characters (such as É), my condition if(!File.Exists(destFilePath)) does not work as expected.

I need to delete the files that are present only in the destination and not in the source. The problem appears when there are special characters in the filename, for example:

file

C:\A00000001\20162350775-Étienne Geoffroy Saint-Hilaire, 1772-1844 a visionary naturalist. Hervé Le Guyader.pdf

destFilePath

D:\A00000001\20162350775-Étienne Geoffroy Saint-Hilaire, 1772-1844 a visionary naturalist. Hervé Le Guyader.pdf

The file in the above case should not have been deleted, because the source and destination have the same filename, but it was. For ordinary filenames there is no issue. My code sample is below:

public void SynchronizeSourceAndDestination(string dir)
{
    foreach (string file in Directory.GetFiles(dir))
    {
        string destFilePath = file.Replace(BackupDirectory, LookupDirectory);

        if (!File.Exists(destFilePath))
        {
            // Delete file from Backup
            File.Delete(file);
        }
    }

    foreach (string directory in Directory.GetDirectories(dir))
    {
        string destinationDirectory = directory.Replace(BackupDirectory, LookupDirectory);

        if (!Directory.Exists(destinationDirectory))
        {
            Directory.Delete(directory, true);
            continue;
        }
        SynchronizeSourceAndDestination(directory);
    }
}

Note: The ASP.NET web application has the setting globalization culture="en-US" uiCulture="en-US" requestEncoding="UTF-8" responseEncoding="UTF-8" fileEncoding="UTF-8" in the web.config file. The code above is a C# console application that processes the files saved by the web application. There is no issue with the filenames on my local machine, but when the code runs on the server, it does not work.

Rename the file, replacing the diacritics with E/e, and try again to test your assumption; the name is unlikely to be causing the problem. Note that if the length of the path on the server exceeds ~260 characters, or there is a permissions problem, Exists() will return false. - Alex K.
@Alex K. The path is only 160 characters long; the problem is caused by the two characters É and é used in the filename. - Simant
If you call Directory.GetFiles on the LookupDirectory, pick out the filename you are interested in, and compare it with file using ==, are they the same or different? Many characters that look the same are in fact different characters. - mjwills
What is the value of dir? BackupDirectory? LookupDirectory? file.Replace(BackupDirectory, LookupDirectory)? - mjwills
Did you mean to use !File.Exists rather than File.Exists? - mjwills

2 Answers

2
votes

It probably has something to do with the length of the file path (more than 260 characters), because File.Exists does work with non-ASCII characters.

I've tested it just a couple of minutes ago with csi.exe, this was the output:

C:\Temp>csi
Microsoft (R) Visual C# Interactive Compiler version 2.2.0.61624
Copyright (C) Microsoft Corporation. All rights reserved.

Type "#help" for more information.
> System.IO.File.Exists("C:\\A00000001\\20162350775-Étienne Geoffroy Saint-Hilaire, 1772-1844 a visionary naturalist. Hervé Le Guyader.pdf")
true
>

As you can see, the result is true. I tested this on a Windows 10 machine with the Dutch language setting and VS2017.2 installed.

--edit-- For completeness, following the comment below, I created this console app to test.

using System.IO;

namespace ConsoleApp1
{
    class Program
    {
        private const string BackupDirectory = "C:\\A00000001\\";
        private const string LookupDirectory = "C:\\A00000002\\";
        static void Main(string[] args)
        {
            SynchronizeSourceAndDestination("C:\\A00000001\\");
        }

        public static void SynchronizeSourceAndDestination(string dir)
        {
            foreach (string file in Directory.GetFiles(dir))
            {
                string destFilePath = file.Replace(BackupDirectory, LookupDirectory);

                if (!File.Exists(destFilePath))
                {
                    // Delete file from Backup
                    File.Delete(file);
                }
            }

            foreach (string directory in Directory.GetDirectories(dir))
            {
                string destinationDirectory = directory.Replace(BackupDirectory, LookupDirectory);

                if (!Directory.Exists(destinationDirectory))
                {
                    Directory.Delete(directory, true);
                    continue;
                }
                SynchronizeSourceAndDestination(directory);
            }
        }
    }
}

Make sure the folders A00000001 and A00000002 exist on your system, and place a file with the same name, including the non-ASCII characters, in both of them (20162350775-Étienne Geoffroy Saint-Hilaire, 1772-1844 a visionary naturalist. Hervé Le Guyader.pdf).

In my case, no file got deleted because of the File.Exists check.
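
If the file is still reported as missing on the server, the two suspicions raised in the comments (path length and look-alike characters) can be checked in code. The following is only a rough sketch: ExistsDiagnostics, Explain and the 260-character threshold are names and values I made up for illustration, and the exact limit depends on the runtime and its long-path settings.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

static class ExistsDiagnostics
{
    // Approximate classic Windows path limit mentioned in the comments (assumption).
    private const int MaxPath = 260;

    // Hypothetical helper: gives a likely reason why File.Exists(path) returned false.
    public static string Explain(string path)
    {
        if (File.Exists(path))
            return "exists";

        if (path.Length >= MaxPath)
            return "path too long (" + path.Length + " characters)";

        string dir = Path.GetDirectoryName(path);
        if (string.IsNullOrEmpty(dir) || !Directory.Exists(dir))
            return "directory missing";

        // Normalize both sides to NFC so that composed and decomposed
        // spellings of the same accented letter compare as equal.
        string wanted = Path.GetFileName(path).Normalize(NormalizationForm.FormC);
        string lookAlike = Directory.EnumerateFiles(dir)
            .Select(f => Path.GetFileName(f))
            .FirstOrDefault(n => n.Normalize(NormalizationForm.FormC) == wanted);

        return lookAlike != null
            ? "look-alike name found with different code points: " + lookAlike
            : "no matching file";
    }
}
```

Calling ExistsDiagnostics.Explain(destFilePath) just before File.Delete(file) and logging the result should show which case applies on the server, without changing what gets deleted.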

0
votes

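I got my solution working by retyping the accented characters in the filename with their Alt codes: É (Alt + 144) and é (Alt + 130). I think the problem arose because the person who created the file had copied and pasted the characters directly, so presumably they looked the same but were not the same characters.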
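
For anyone who lands here with the same symptom, this is a small sketch of how two names can look identical and still compare as different. It assumes the pasted É was stored as E followed by a combining acute accent (U+0301), which I did not verify on the original file; LookAlikeDemo, CodeUnits and SameAfterNfc are names made up for the example.

```csharp
using System;
using System.Linq;
using System.Text;

static class LookAlikeDemo
{
    // Lists the UTF-16 code units of a string, e.g. "U+00C9 U+0074".
    public static string CodeUnits(string s) =>
        string.Join(" ", s.Select(c => "U+" + ((int)c).ToString("X4")));

    // True when both strings are equal after normalization to NFC.
    public static bool SameAfterNfc(string a, string b) =>
        a.Normalize(NormalizationForm.FormC) == b.Normalize(NormalizationForm.FormC);

    static void Main()
    {
        string typed = "\u00C9tienne";   // É as a single precomposed code point
        string pasted = "E\u0301tienne"; // E + combining acute accent (assumed form)

        Console.WriteLine(CodeUnits(typed.Substring(0, 2)));  // U+00C9 U+0074
        Console.WriteLine(CodeUnits(pasted.Substring(0, 3))); // U+0045 U+0301 U+0074
        Console.WriteLine(typed == pasted);                   // False
        Console.WriteLine(SameAfterNfc(typed, pasted));       // True
    }
}
```

Since == on strings is an ordinal comparison and NTFS stores names as given, a path built from one form will not find a file saved under the other, which would match what I saw with File.Exists.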