1
votes

I have a large string containing the contents of a CSV file. However, when I use a regular expression such as:

Regex regex = new Regex(@"\w+|""[\w\s]*""");

it splits on every letter instead. There are no spaces between fields, only at the end of each line, and it shouldn't cut a field where there is a space inside double quotes.

Example: test1,test2,test3,test4,test5,"test 6",test7 (new line) test8,test9, etc.

Can somebody point me in the right direction? Thanks.

2
Can you use an existing library, such as codeproject.com/Articles/9258/A-Fast-CSV-Reader? - Eric

2 Answers

4
votes

I recommend using an existing solution rather than writing your own (unless you're going for the learning experience!). Parsing CSV is trickier than it seems.

EDIT: Didn't see you were using C#. Here are more links.
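
To illustrate why it is trickier than it seems, here is a minimal sketch of what goes wrong with the most obvious approach. The class and method names (`NaiveSplitDemo`, `NaiveSplit`) are made up for this example; the only library call is `String.Split`.

```csharp
using System;

static class NaiveSplitDemo
{
    // Hypothetical helper: splits on every comma and knows nothing
    // about quoting, so a comma inside "..." is treated as a delimiter.
    public static string[] NaiveSplit(string line)
    {
        return line.Split(',');
    }
}
```

Calling `NaiveSplitDemo.NaiveSplit("test5,\"test, 6\",test7")` yields four pieces rather than three, because the quoted field is cut at its embedded comma and the quote characters are left in place. Escaped quotes and line breaks inside quoted fields cause similar problems.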

1
votes

Use an existing CSV parser instead of trying to use Regex. The format is subtle, as you have seen.

FileHelpers is one popular library for this, and there is also the TextFieldParser class in the Microsoft.VisualBasic.FileIO namespace.
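
As a rough sketch of the TextFieldParser route, assuming the CSV text is already in a string and that the project references the Microsoft.VisualBasic assembly, something like the following should work. The `CsvSketch` class and `ParseCsv` method are names invented for this example, not part of any library.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

static class CsvSketch
{
    // Hypothetical helper: parses CSV text into a list of rows,
    // where each row is an array of field values.
    public static List<string[]> ParseCsv(string csvText)
    {
        var rows = new List<string[]>();
        using (var reader = new StringReader(csvText))
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            // Treat "..." as a single field, so "test 6" stays intact.
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }
}
```

With input like the example in the question, `CsvSketch.ParseCsv(text)` returns one array per line, and the quoted field comes back as `test 6` without the surrounding quotes. To read from a file instead, TextFieldParser also has a constructor that takes a file path.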