369
votes

I am trying to read a *.csv-file.

The *.csv-file consist of two columns separated by semicolon (";").

I am able to read the *.csv-file using StreamReader and able to separate each line by using the Split() function. I want to store each column into a separate array and then display it.

Is it possible to do that?

20
@Marc: unfortunately in non-english cultures (e.g. Italian) when you save an excel to CSV it uses ";" as separator... this has made CSV a non-standard imo :(digEmAll
I always read CSV as character-separated-values since people call files CSV even if they don't use a comma as separator. And there are so many dialects with different quoting or escaping rules in practice that you can't really talk of a standard even if in theory there is a RFC.CodesInChaos
CSV file extension name should now get change to DSV - Delimiter Separated Values FileAmbuj
For all of the answers that simply split the string on the delimiter character, this is not the best way to go. There are more rules to the CSV format that this will not cover. It is best to use a 3rd party parser. More info- dotnetcoretutorials.com/2018/08/04/csv-parsing-in-net-coreiliketocode

20 Answers

495
votes

You can do it like this:

using System.IO;

static void Main(string[] args)
{
    using(var reader = new StreamReader(@"C:\test.csv"))
    {
        List<string> listA = new List<string>();
        List<string> listB = new List<string>();
        while (!reader.EndOfStream)
        {
            var line = reader.ReadLine();
            var values = line.Split(';');

            listA.Add(values[0]);
            listB.Add(values[1]);
        }
    }
}
219
votes

My favourite CSV parser is one built into .NET library. This is a hidden treasure inside Microsoft.VisualBasic namespace. Below is a sample code:

using Microsoft.VisualBasic.FileIO;

var path = @"C:\Person.csv"; // Habeeb, "Dubai Media City, Dubai"
using (TextFieldParser csvParser = new TextFieldParser(path))
{
 csvParser.CommentTokens = new string[] { "#" };
 csvParser.SetDelimiters(new string[] { "," });
 csvParser.HasFieldsEnclosedInQuotes = true;

 // Skip the row with the column names
 csvParser.ReadLine();

 while (!csvParser.EndOfData)
 {
  // Read current line fields, pointer moves to the next line.
  string[] fields = csvParser.ReadFields();
  string Name = fields[0];
  string Address = fields[1];
 }
}

Remember to add reference to Microsoft.VisualBasic

More details about the parser is given here: http://codeskaters.blogspot.ae/2015/11/c-easiest-csv-parser-built-in-net.html

79
votes

LINQ way:

var lines = File.ReadAllLines("test.txt").Select(a => a.Split(';'));
var csv = from line in lines
          select (from piece in line
                  select piece);

^^Wrong - Edit by Nick

It appears the original answerer was attempting to populate csv with a 2 dimensional array - an array containing arrays. Each item in the first array contains an array representing that line number with each item in the nested array containing the data for that specific column.

var csv = from line in lines
          select (line.Split(',')).ToArray();
42
votes

Just came across this library: https://github.com/JoshClose/CsvHelper

Very intuitive and easy to use. Has a nuget package too which made is quick to implement: http://nuget.org/packages/CsvHelper/1.17.0. Also appears to be actively maintained which I like.

Configuring it to use a semi-colon is easy: https://github.com/JoshClose/CsvHelper/wiki/Custom-Configurations

39
votes

You can't create an array immediately because you need to know the number of rows from the beginning (and this would require to read the csv file twice)

You can store values in two List<T> and then use them or convert into an array using List<T>.ToArray()

Very simple example:

var column1 = new List<string>();
var column2 = new List<string>();
using (var rd = new StreamReader("filename.csv"))
{
    while (!rd.EndOfStream)
    {
        var splits = rd.ReadLine().Split(';');
        column1.Add(splits[0]);
        column2.Add(splits[1]);
    }
}
// print column1
Console.WriteLine("Column 1:");
foreach (var element in column1)
    Console.WriteLine(element);

// print column2
Console.WriteLine("Column 2:");
foreach (var element in column2)
    Console.WriteLine(element);

N.B.

Please note that this is just a very simple example. Using string.Split does not account for cases where some records contain the separator ; inside it.
For a safer approach, consider using some csv specific libraries like CsvHelper on nuget.

35
votes

I usually use this parser from codeproject, since there's a bunch of character escapes and similar that it handles for me.

31
votes

Here is my variation of the top voted answer:

var contents = File.ReadAllText(filename).Split('\n');
var csv = from line in contents
          select line.Split(',').ToArray();

The csv variable can then be used as in the following example:

int headerRows = 5;
foreach (var row in csv.Skip(headerRows)
    .TakeWhile(r => r.Length > 1 && r.Last().Trim().Length > 0))
{
    String zerothColumnValue = row[0]; // leftmost column
    var firstColumnValue = row[1];
}
11
votes

If you need to skip (head-)lines and/or columns, you can use this to create a 2-dimensional array:

    var lines = File.ReadAllLines(path).Select(a => a.Split(';'));
    var csv = (from line in lines               
               select (from col in line
               select col).Skip(1).ToArray() // skip the first column
              ).Skip(2).ToArray(); // skip 2 headlines

This is quite useful if you need to shape the data before you process it further (assuming the first 2 lines consist of the headline, and the first column is a row title - which you don't need to have in the array because you just want to regard the data).

N.B. You can easily get the headlines and the 1st column by using the following code:

    var coltitle = (from line in lines 
                    select line.Skip(1).ToArray() // skip 1st column
                   ).Skip(1).Take(1).FirstOrDefault().ToArray(); // take the 2nd row
    var rowtitle = (from line in lines select line[0] // take 1st column
                   ).Skip(2).ToArray(); // skip 2 headlines

This code example assumes the following structure of your *.csv file:

CSV Matrix

Note: If you need to skip empty rows - which can by handy sometimes, you can do so by inserting

    where line.Any(a=>!string.IsNullOrWhiteSpace(a))

between the from and the select statement in the LINQ code examples above.

10
votes

You can use Microsoft.VisualBasic.FileIO.TextFieldParser dll in C# for better performance

get below code example from above article

static void Main()
{
    string csv_file_path=@"C:\Users\Administrator\Desktop\test.csv";

    DataTable csvData = GetDataTabletFromCSVFile(csv_file_path);

    Console.WriteLine("Rows count:" + csvData.Rows.Count);

    Console.ReadLine();
}


private static DataTable GetDataTabletFromCSVFile(string csv_file_path)
{
    DataTable csvData = new DataTable();

    try
    {

    using(TextFieldParser csvReader = new TextFieldParser(csv_file_path))
        {
            csvReader.SetDelimiters(new string[] { "," });
            csvReader.HasFieldsEnclosedInQuotes = true;
            string[] colFields = csvReader.ReadFields();
            foreach (string column in colFields)
            {
                DataColumn datecolumn = new DataColumn(column);
                datecolumn.AllowDBNull = true;
                csvData.Columns.Add(datecolumn);
            }

            while (!csvReader.EndOfData)
            {
                string[] fieldData = csvReader.ReadFields();
                //Making empty value as null
                for (int i = 0; i < fieldData.Length; i++)
                {
                    if (fieldData[i] == "")
                    {
                        fieldData[i] = null;
                    }
                }
                csvData.Rows.Add(fieldData);
            }
        }
    }
    catch (Exception ex)
    {
    }
    return csvData;
}
5
votes

Hi all, I created a static class for doing this. + column check + quota sign removal

public static class CSV
{
    public static List<string[]> Import(string file, char csvDelimiter, bool ignoreHeadline, bool removeQuoteSign)
    {
        return ReadCSVFile(file, csvDelimiter, ignoreHeadline, removeQuoteSign);
    }

    private static List<string[]> ReadCSVFile(string filename, char csvDelimiter, bool ignoreHeadline, bool removeQuoteSign)
    {
        string[] result = new string[0];
        List<string[]> lst = new List<string[]>();

        string line;
        int currentLineNumner = 0;
        int columnCount = 0;

        // Read the file and display it line by line.  
        using (System.IO.StreamReader file = new System.IO.StreamReader(filename))
        {
            while ((line = file.ReadLine()) != null)
            {
                currentLineNumner++;
                string[] strAr = line.Split(csvDelimiter);
                // save column count of dirst line
                if (currentLineNumner == 1)
                {
                    columnCount = strAr.Count();
                }
                else
                {
                    //Check column count of every other lines
                    if (strAr.Count() != columnCount)
                    {
                        throw new Exception(string.Format("CSV Import Exception: Wrong column count in line {0}", currentLineNumner));
                    }
                }

                if (removeQuoteSign) strAr = RemoveQouteSign(strAr);

                if (ignoreHeadline)
                {
                    if(currentLineNumner !=1) lst.Add(strAr);
                }
                else
                {
                    lst.Add(strAr);
                }
            }

        }

        return lst;
    }
    private static string[] RemoveQouteSign(string[] ar)
    {
        for (int i = 0;i< ar.Count() ; i++)
        {
            if (ar[i].StartsWith("\"") || ar[i].StartsWith("'")) ar[i] = ar[i].Substring(1);
            if (ar[i].EndsWith("\"") || ar[i].EndsWith("'")) ar[i] = ar[i].Substring(0,ar[i].Length-1);

        }
        return ar;
    }

}
4
votes
var firstColumn = new List<string>();
var lastColumn = new List<string>();

// your code for reading CSV file

foreach(var line in file)
{
    var array = line.Split(';');
    firstColumn.Add(array[0]);
    lastColumn.Add(array[1]);
}

var firstArray = firstColumn.ToArray();
var lastArray = lastColumn.ToArray();
4
votes

I have spend few hours searching for a right library, but finally I wrote my own code :) You can read file (or database) with whatever tools you want and then apply the following routine to each line:

private static string[] SmartSplit(string line, char separator = ',')
{
    var inQuotes = false;
    var token = "";
    var lines = new List<string>();
    for (var i = 0; i < line.Length; i++) {
        var ch = line[i];
        if (inQuotes) // process string in quotes, 
        {
            if (ch == '"') {
                if (i<line.Length-1 && line[i + 1] == '"') {
                    i++;
                    token += '"';
                }
                else inQuotes = false;
            } else token += ch;
        } else {
            if (ch == '"') inQuotes = true;
            else if (ch == separator) {
                lines.Add(token);
                token = "";
                } else token += ch;
            }
    }
    lines.Add(token);
    return lines.ToArray();
}
3
votes

Here's a special case where one of data field has semicolon (";") as part of it's data in that case most of answers above will fail.

Solution it that case will be

string[] csvRows = System.IO.File.ReadAllLines(FullyQaulifiedFileName);
string[] fields = null;
List<string> lstFields;
string field;
bool quoteStarted = false;
foreach (string csvRow in csvRows)
{
    lstFields = new List<string>();
    field = "";
    for (int i = 0; i < csvRow.Length; i++)
    {
        string tmp = csvRow.ElementAt(i).ToString();
        if(String.Compare(tmp,"\"")==0)
        {
            quoteStarted = !quoteStarted;
        }
        if (String.Compare(tmp, ";") == 0 && !quoteStarted)
        {
            lstFields.Add(field);
            field = "";
        }
        else if (String.Compare(tmp, "\"") != 0)
        {
            field += tmp;
        }
    }
    if(!string.IsNullOrEmpty(field))
    {
        lstFields.Add(field);
        field = "";
    }
// This will hold values for each column for current row under processing
    fields = lstFields.ToArray(); 
}
2
votes

The open-source Angara.Table library allows to load CSV into typed columns, so you can get the arrays from the columns. Each column can be indexed both by name or index. See http://predictionmachines.github.io/Angara.Table/saveload.html.

The library follows RFC4180 for CSV; it enables type inference and multiline strings.

Example:

using System.Collections.Immutable;
using Angara.Data;
using Angara.Data.DelimitedFile;

...

ReadSettings settings = new ReadSettings(Delimiter.Semicolon, false, true, null, null);
Table table = Table.Load("data.csv", settings);
ImmutableArray<double> a = table["double-column-name"].Rows.AsReal;

for(int i = 0; i < a.Length; i++)
{
    Console.WriteLine("{0}: {1}", i, a[i]);
}

You can see a column type using the type Column, e.g.

Column c = table["double-column-name"];
Console.WriteLine("Column {0} is double: {1}", c.Name, c.Rows.IsRealColumn);

Since the library is focused on F#, you might need to add a reference to the FSharp.Core 4.4 assembly; click 'Add Reference' on the project and choose FSharp.Core 4.4 under "Assemblies" -> "Extensions".

1
votes

I have been using csvreader.com(paid component) for years, and I have never had a problem. It is solid, small and fast, but you do have to pay for it. You can set the delimiter to whatever you like.

using (CsvReader reader = new CsvReader(s) {
    reader.Settings.Delimiter = ';';
    reader.ReadHeaders();  // if headers on a line by themselves.  Makes reader.Headers[] available
    while (reader.ReadRecord())
        ... use reader.Values[col_i] ...
}
1
votes

I am just student working on my master's thesis, but this is the way I solved it and it worked well for me. First you select your file from directory (only in csv format) and then you put the data into the lists.

List<float> t = new List<float>();
List<float> SensorI = new List<float>();
List<float> SensorII = new List<float>();
List<float> SensorIII = new List<float>();
using (OpenFileDialog dialog = new OpenFileDialog())
{
    try
    {
        dialog.Filter = "csv files (*.csv)|*.csv";
        dialog.Multiselect = false;
        dialog.InitialDirectory = ".";
        dialog.Title = "Select file (only in csv format)";
        if (dialog.ShowDialog() == DialogResult.OK)
        {
            var fs = File.ReadAllLines(dialog.FileName).Select(a => a.Split(';'));
            int counter = 0;
            foreach (var line in fs)
            {
                counter++;
                if (counter > 2)    // Skip first two headder lines
                {
                    this.t.Add(float.Parse(line[0]));
                    this.SensorI.Add(float.Parse(line[1]));
                    this.SensorII.Add(float.Parse(line[2]));
                    this.SensorIII.Add(float.Parse(line[3]));
                }
            }
        }
    }
    catch (Exception exc)
    {
        MessageBox.Show(
            "Error while opening the file.\n" + exc.Message, 
            this.Text, 
            MessageBoxButtons.OK, 
            MessageBoxIcon.Error
        );
    }
}
1
votes

My simple static methods to convert csv line to array, and array to csv line.

public static string CsvRowFromStringArray(string[] csvData, char fieldSeparator = ',', char stringQuote = '"')
{
    csvData = csvData.Select(element => {
        if (element.Contains(stringQuote))
        {
            element = element.Replace(stringQuote.ToString(), stringQuote.ToString() + stringQuote.ToString());
        }
        if (element.Contains(fieldSeparator))
        {
            element = "\"" + element + "\"";
        }
        return element;
    }).ToArray();
    return string.Join(fieldSeparator.ToString(), csvData);
}

public static string[] CsvRowToStringArray(string csvRow, char fieldSeparator = ',', char stringQuote = '"')
{
    char tempQuote = (char)162;
    while (csvRow.Contains(tempQuote)) { tempQuote = (char)(tempQuote + 1); }
    char tempSeparator = (char)(tempQuote + 1);
    while (csvRow.Contains(tempSeparator)) { tempSeparator = (char)(tempSeparator + 1); }

    csvRow = csvRow.Replace(stringQuote.ToString() + stringQuote.ToString(), tempQuote.ToString());
    var csvArray = csvRow.Split(fieldSeparator).ToList().Aggregate("",
        (string row, string item) =>
        {
            if (row.Count((ch) => ch == stringQuote) % 2 == 0) { return row + (row.Length > 0 ? tempSeparator.ToString() : "") + item; }
            else { return row + fieldSeparator + item; }
        },
        (string row) => row.Split(tempSeparator).Select((string item) => item.Trim(stringQuote).Replace(tempQuote, stringQuote))).ToArray();
    return csvArray;
}

private bool CsvTestError()
{
    string correctString = "0;a;\"b; c\";\"\"xy;\"this;is; one \"\"long; cell\"\"\"";
    string[] correctArray = new string[] { "0", "a", "b; c", "\"xy", "this;is; one \"long; cell\"" };

    bool error = string.Join("°", CsvRowToStringArray(correctString, ';')) != string.Join("°", correctArray);
    error = (CsvRowFromStringArray(correctArray, ';') != correctString) || error;
    return error;
}
0
votes

Still wrong. You need to compensate for "" in quotes. Here is my solution Microsoft style csv.

               /// <summary>
    /// Microsoft style csv file.  " is the quote character, "" is an escaped quote.
    /// </summary>
    /// <param name="fileName"></param>
    /// <param name="sepChar"></param>
    /// <param name="quoteChar"></param>
    /// <param name="escChar"></param>
    /// <returns></returns>
    public static List<string[]> ReadCSVFileMSStyle(string fileName, char sepChar = ',', char quoteChar = '"')
    {
        List<string[]> ret = new List<string[]>();

        string[] csvRows = System.IO.File.ReadAllLines(fileName);

        foreach (string csvRow in csvRows)
        {
            bool inQuotes = false;
            List<string> fields = new List<string>();
            string field = "";
            for (int i = 0; i < csvRow.Length; i++)
            {
                if (inQuotes)
                {
                    // Is it a "" inside quoted area? (escaped litteral quote)
                    if(i < csvRow.Length - 1 && csvRow[i] == quoteChar && csvRow[i+1] == quoteChar)
                    {
                        i++;
                        field += quoteChar;
                    }
                    else if(csvRow[i] == quoteChar)
                    {
                        inQuotes = false;
                    }
                    else
                    {
                        field += csvRow[i];
                    }
                }
                else // Not in quoted region
                {
                     if (csvRow[i] == quoteChar)
                    {
                        inQuotes = true;
                    }
                    if (csvRow[i] == sepChar)
                    {
                        fields.Add(field);
                        field = "";
                    }
                    else 
                    {
                        field += csvRow[i];
                    }
                }
            }
            if (!string.IsNullOrEmpty(field))
            {
                fields.Add(field);
                field = "";
            }
            ret.Add(fields.ToArray());
        }

        return ret;
    }
}
0
votes

I have a library that is doing exactly you need.

Some time ago I had wrote simple and fast enough library for work with CSV files. You can find it by the following link: https://github.com/ukushu/DataExporter

It works with CSV like with 2 dimensions array. Exactly like you need.

As example, in case of you need all of values of 3rd row only you need is to write:

Csv csv = new Csv();

csv.FileOpen("c:\\file1.csv");

var allValuesOf3rdRow = csv.Rows[2];

or to read 2nd cell of

var value = csv.Rows[2][1];
-2
votes

look at this

using CsvFramework;

using System.Collections.Generic;

namespace CvsParser {

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<Order> Orders { get; set; }        
}

public class Order
{
    public int Id { get; set; }

    public int CustomerId { get; set; }
    public int Quantity { get; set; }

    public int Amount { get; set; }

    public List<OrderItem> OrderItems { get; set; }

}

public class Address
{
    public int Id { get; set; }
    public int CustomerId { get; set; }

    public string Name { get; set; }
}

public class OrderItem
{
    public int Id { get; set; }
    public int OrderId { get; set; }

    public string ProductName { get; set; }
}

class Program
{
    static void Main(string[] args)
    {

        var customerLines = System.IO.File.ReadAllLines(@"Customers.csv");
        var orderLines = System.IO.File.ReadAllLines(@"Orders.csv");
        var orderItemLines = System.IO.File.ReadAllLines(@"OrderItemLines.csv");

        CsvFactory.Register<Customer>(builder =>
        {
            builder.Add(a => a.Id).Type(typeof(int)).Index(0).IsKey(true);
            builder.Add(a => a.Name).Type(typeof(string)).Index(1);
            builder.AddNavigation(n => n.Orders).RelationKey<Order, int>(k => k.CustomerId);

        }, false, ',', customerLines);

        CsvFactory.Register<Order>(builder =>
        {
            builder.Add(a => a.Id).Type(typeof(int)).Index(0).IsKey(true);
            builder.Add(a => a.CustomerId).Type(typeof(int)).Index(1);
            builder.Add(a => a.Quantity).Type(typeof(int)).Index(2);
            builder.Add(a => a.Amount).Type(typeof(int)).Index(3);
            builder.AddNavigation(n => n.OrderItems).RelationKey<OrderItem, int>(k => k.OrderId);

        }, true, ',', orderLines);


        CsvFactory.Register<OrderItem>(builder =>
        {
            builder.Add(a => a.Id).Type(typeof(int)).Index(0).IsKey(true);
            builder.Add(a => a.OrderId).Type(typeof(int)).Index(1);
            builder.Add(a => a.ProductName).Type(typeof(string)).Index(2);


        }, false, ',', orderItemLines);



        var customers = CsvFactory.Parse<Customer>();


    }
}

}