3
votes

I'm reading a csv file using c# here is a little code snippet.

using (StreamReader readFile = new StreamReader("C:\\temp\\" + whichTable))
{
    while ((line = readFile.ReadLine()) != null)
    {
        row = line.Split(',');

        switch (row.Length)
        {
            case 5:
                if (counter == 0) 
                { 
                    break; 
                } 
                else
                {
                    v00.Add(Convert.ToInt32(Regex.Replace(row[0], @"[^\w\.@-]", "")));
                }

                if (row[1] == "") 
                { 
                    v01.Add((1)); 
                }
                else
                {
                    v01.Add(Convert.ToInt32(Regex.Replace(row[1], @"[^\w\.@-]", "")));
                }

                if(row[2]=="")
                {
                    v02.Add(2);
                } 
                else
                {
                    v02.Add(Convert.ToInt32(Regex.Replace(row[2], @"[^\w\.@-]", "")));
                }

                v3.Add(row[4]);
                v4.Add(row[3]);
                counter++;
                break;
        }
        counter++;
    }
    break;
}

as you can tell from my code i test the length of the string row to make sure that its five long exactly. My problem is that if have a field within the csv with a comma it then calculates to more then 5. My csv is well formed so when that happens i do have a that field double quoted. How can i tell c# only count the commas outside of double quotes? Thats really my question.

3
Unless you want to reinvent the wheel, check out this fast CSV reader: codeproject.com/Articles/9258/A-Fast-CSV-ReaderEric J.
Not be a stickler or anything, but you should try and clean up your code (even if it's just when you post it to StackOverflow).Brandon Buck
izuriel whats wrong with the code now? I'm open to suggestions.Miguel
@Miguel You didn't convert your tab length into spaces (and kept your tabs as they were when you copy/pasted the code), and you have your if/else lines minified essentially. I'm not saying that your code is bad and needs to be fixed, I'm just saying the formatting should be cleaned up when posting here, I'm not suggesting you should write your code pretty unless you want to, but for the sake of getting help you may want to make it a bit cleaner and better formatted.Brandon Buck
Thanks for pointing this out. Sorry i was actually thinking i did a good job with the post. I'll do a better job next time. Thank for editing.Miguel

3 Answers

10
votes

Don't parse CSV yourself - the format is more difficult to parse properly than most people realize. There are many exsisting good CSV parsers that you can use instead.

There is the TextFieldParser library that lives in the Microsoft.VisualBasic.FileIO namespace (regular .NET libarary), and many third party ones - FileHelpers is a popular free choice.

2
votes

there are multiple possible workarounds, the simplest would be go down the row character by character and count the commas yourself. if you encounter a quotation mark, you can toggle a boolean, say. bool inQuotes, and when inQuotes is true, you just ignore commas.

0
votes

You can use ths bellow code ,Its working for me :

  private void ImportCSV(string filePath = @"E:\nucc_taxonomy_140.csv", string tableName = "TempTaxonomyCodes")
    {
        string tempPath = System.IO.Path.GetDirectoryName(filePath);
        string strConn = @"Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=" + tempPath + @"\;Extensions=asc,csv,tab,txt";
        OdbcConnection conn = new OdbcConnection(strConn);
        OdbcDataAdapter da = new OdbcDataAdapter("Select * from " + System.IO.Path.GetFileName(filePath), conn);
        DataTable dt = new DataTable();
        da.Fill(dt);

        using (SqlBulkCopy bulkCopy = new SqlBulkCopy(ConfigurationSettings.AppSettings["dbConnectionString"]))
        {
            bulkCopy.DestinationTableName = tableName;
            bulkCopy.BatchSize = 50;
            bulkCopy.WriteToServer(dt);
        }

    }