0
votes

I have to read a text file and then parse it in C# using VS 2010. The sample text is as follows:

[TOOL_TYPE]

; provides the name of the selected tool for programming

“Phoenix Select Advanced”;

[TOOL_SERIAL_NUMBER]

; provides the serial number for the tool

7654321;

[PRESSURE_CORRECTION]

; provides the Pressure correction information requirement

 “Yes”;

[SURFACE_MOUNT]

; provides the surface mount information

“Yes”;

[SAPPHIRE_TYPE]

; provides the sapphire type information

“No”;

Now I have to parse only the string data (in double quotes) and the headers (in square brackets []), and then save them to another text file. I can parse the headers successfully, but the string data in double quotes does not appear correctly, as shown below.

[TOOL_TYPE]
�Phoenix Select Advanced�;
[TOOL_SERIAL_NUMBER]
7654321;
[PRESSURE_CORRECTION]
�Yes�;
[SURFACE_MOUNT]
�Yes�;
[SAPPHIRE_TYPE]
�No�;
[EXTENDED_TELEMETRY]
�Yes�;
[OVERRIDE_SENSE_RESISTOR]
�No�;

Note the special character (�) that appears wherever a double quote should be.

How can I write the double quotes (") to the destination file and avoid the (�)?

Update

I am using the following line to write each parsed line:

temporaryconfigFileWriter.WriteLine(configFileLine, false, Encoding.Unicode);

Here is the complete code I am using:

        string temporaryConfigurationFileName = System.Environment.GetFolderPath(Environment.SpecialFolder.Desktop) + "\\Temporary_Configuration_File.txt";

        //Pointers to read from Configuration File 'configFileReader' and to write to Temporary Configuration File 'temporaryconfigFileWriter'
        StreamReader configFileReader = new StreamReader(CommandLineVariables.ConfigurationFileName);
        StreamWriter temporaryconfigFileWriter = new StreamWriter(temporaryConfigurationFileName);

        //Check whether the 'END_OF_FILE' header is specified or not, to avoid searching for end of file indefinitely
        if ((File.ReadAllText(CommandLineVariables.ConfigurationFileName)).Contains("[END_OF_FILE]"))
        {
            //Read the file until it reaches 'END_OF_FILE'
            while (!((configFileLine = configFileReader.ReadLine()).Contains("[END_OF_FILE]")))
            {
                configFileLine = configFileLine.Trim();
                if (!(configFileLine.StartsWith(";")) && !(string.IsNullOrEmpty(configFileLine)))
                {
                    temporaryconfigFileWriter.WriteLine(configFileLine, false, Encoding.UTF8);
                }
            }
            // to write the last header [END_OF_FILE]
            temporaryconfigFileWriter.WriteLine(configFileLine);

            configFileReader.Close();
            temporaryconfigFileWriter.Close();
        }
2
You need to provide the code you use for reading, parsing, and writing. I suspect there is something off with the encodings you use along the way. - Yahia
Show us the code you use to write text to the output file. - Shai

2 Answers

5
votes

Your input file doesn't actually contain standard double quotes. It contains the opening double quote (“) and the closing double quote (”), not the straight ASCII version (").

First, you must make sure that you are reading your input with the correct encoding. (Try several and display the string in a TextBox in C#; you'll see pretty quickly whether the characters show correctly.)
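One way to try several encodings quickly is to decode the same raw bytes with each candidate and compare the results. The following is only a sketch: `EncodingProbe` and `DecodeWithEach` are hypothetical names that do not appear in the question, and the list of candidates is up to you.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration; not part of the question's code.
public static class EncodingProbe
{
    // Decodes the same raw bytes with each candidate encoding so the
    // results can be compared by eye (in a TextBox or the debugger).
    // The dictionary key is the encoding's WebName, e.g. "utf-8".
    public static Dictionary<string, string> DecodeWithEach(
        byte[] raw, IEnumerable<Encoding> candidates)
    {
        var results = new Dictionary<string, string>();
        foreach (Encoding candidate in candidates)
        {
            results[candidate.WebName] = candidate.GetString(raw);
        }
        return results;
    }
}
```

It could be called as `EncodingProbe.DecodeWithEach(File.ReadAllBytes(path), new[] { Encoding.UTF8, Encoding.Unicode, Encoding.GetEncoding(1252) })`. The candidate whose output shows “ and ” instead of � is the one to pass to the `StreamReader`.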

If you want such characters to appear in your output, you must write the output file in something other than ASCII. If you write it as UTF-8, for example, you should make sure that it starts with the byte order mark. Otherwise it will still be readable, but some software, such as Notepad, may display several garbage characters for each quote because it won't detect that the file isn't ASCII.
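As a sketch of the UTF-8 option: the encoding has to be given to the `StreamWriter` constructor, because `new StreamWriter(path)` on its own writes UTF-8 without a byte order mark. Note also that `WriteLine(configFileLine, false, Encoding.UTF8)` in the question binds to the format-string overload `WriteLine(string, object, object)`, so the encoding argument there has no effect on the output. `BomWriter` and `WriteLines` below are hypothetical names used only for illustration.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper for illustration; not part of the question's code.
public static class BomWriter
{
    // Writes the given lines as UTF-8 with a byte order mark.
    public static void WriteLines(string path, string[] lines)
    {
        // new UTF8Encoding(true) makes the writer emit the BOM (EF BB BF)
        // at the start of the file; "false" means overwrite, not append.
        using (var writer = new StreamWriter(path, false, new UTF8Encoding(true)))
        {
            foreach (string line in lines)
            {
                writer.WriteLine(line);
            }
        }
    }
}
```

This only helps if the lines were decoded correctly when read. If the reader already produced �, the writer will faithfully write � whatever encoding it uses.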

Another choice is to simply replace “ and ” with ".
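The replacement approach could look roughly like this, assuming the line has already been read with the correct encoding so that it really contains U+201C and U+201D rather than �. `QuoteNormalizer` and `ToStraightQuotes` are hypothetical names.

```csharp
// Hypothetical helper for illustration; not part of the question's code.
public static class QuoteNormalizer
{
    // Replaces the typographic opening (U+201C) and closing (U+201D)
    // double quotes with the straight ASCII double quote (U+0022).
    public static string ToStraightQuotes(string line)
    {
        return line.Replace('\u201C', '"').Replace('\u201D', '"');
    }
}
```

After this step every character in the sample data is plain ASCII, so the encoding of the output file no longer matters for the quotes.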

3
votes

It appears that you are using proper typographic quotes (“...”) instead of the straight ASCII ones ("..."). My guess would be that you read the text file with the wrong encoding.

If you can see them properly in Notepad and neither ASCII nor one of the Unicode encodings works, then it's probably codepage 1252. You can get that encoding via

Encoding.GetEncoding(1252)
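A minimal sketch of how that encoding might be used, under the assumption that the source file really is code page 1252: pass it to the `StreamReader` so the quotes are decoded correctly, and write the result as UTF-8 with a byte order mark. `ConfigCopier` and `CopyLines` are hypothetical names, and the `[END_OF_FILE]` handling from the question is left out for brevity.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration; not part of the question's code.
public static class ConfigCopier
{
    // Copies every non-empty, non-comment line from sourcePath to
    // targetPath. The source is decoded with sourceEncoding; the target
    // is written as UTF-8 with a byte order mark.
    public static void CopyLines(string sourcePath, string targetPath, Encoding sourceEncoding)
    {
        using (var reader = new StreamReader(sourcePath, sourceEncoding))
        using (var writer = new StreamWriter(targetPath, false, new UTF8Encoding(true)))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                line = line.Trim();
                // Skip blank lines and comment lines starting with ';'.
                if (line.Length > 0 && !line.StartsWith(";", StringComparison.Ordinal))
                {
                    writer.WriteLine(line);
                }
            }
        }
    }
}
```

A call would look like `ConfigCopier.CopyLines(sourcePath, targetPath, Encoding.GetEncoding(1252))`. On the .NET Framework that ships with VS 2010 this works as written; on .NET Core and later, code page 1252 is only available after `Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)` has been called.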