I'm using the following command to read a csv file:

fid=fopen('test.csv');
scannedData = textscan(fid, '%4.0u%2.0u%2.0u%2.0u%2.0u%2.0u,%u,%u,%q,%q,%f,%f,%.2f,%u','whitespace','"');
fclose(fid);

The problem is that textscan doesn't read the value from the last field and stops after 1 line. I tried skipping that field, assigning it a different type, and various eof combinations in the textscan call, but nothing helped.

The data in the file looks like this:

"20100324072328","501","1","str1","str2","4.6846712","52.0159507","1.250000","128.000000"
"20100324072519","501","1","str1","str2","4.6846122","52.0159346","0.000000","128.000000"
"20100324072640","501","1","str1","str2","4.6846014","52.0159453","0.000000","128.000000"
"20100324072812","501","1","str1","str2","4.6845907","52.0159507","0.000000","96.000000"
"20100324073002","501","1","str1","str2","4.6845800","52.0159614","0.000000","128.000000"

I'd like to parse the first field directly with textscan, as I'm trying to do with the commands above.

I don't want to use the alternative of reading the fields with %q and then parsing the resulting arrays.

So, I would appreciate any suggestions to make textscan do it all in one go.

Thanks.

You need 'Delimiter',','. - Oleg
Adding the delimiter option, in combination with the format above, parses only my first field and doesn't read the rest of the fields. That's why I'm leaving the delimiter out and putting the commas in the format spec. - user2618054
Do not use the whitespace option; instead, embed the " in the format string and keep the delimiter set to ,. - Oleg
Keeping the 'delimiter' option limits me to only 9 fields, so I can't parse the 1st field and still read the rest. - user2618054

1 Answer

If you want to treat " as whitespace, then you should not use %q: it needs the double quotes to identify the full string and cannot find them once they are treated as whitespace. Use %s instead:

fid = fopen('test.csv');   % same file as in the question
fmt = '%4u%2u%2u%2u%2u%2u%u%u%s%s%f%f%f%u';
out = textscan(fid,fmt,'Delimiter',',','Whitespace','"');
fclose(fid);
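
Once the timestamp has been split into six integer columns, they can be combined into serial date numbers. The following is a minimal sketch, not part of the original answer: the cell array below is a hypothetical stand-in for out{1} to out{6}, filled by hand with the first two timestamps from the question, and it assumes those columns come back as integer column vectors.

```matlab
% Hypothetical stand-in for out{1}..out{6}: two rows from the question,
% already split into year, month, day, hour, minute, second.
out = {uint32([2010; 2010]), uint32([3; 3]), uint32([24; 24]), ...
       uint32([7; 7]), uint32([23; 25]), uint32([28; 19])};

% Concatenate the six columns into an N-by-6 date vector matrix.
% datenum expects doubles, so convert from the integer class first.
dv = double([out{1:6}]);
serial = datenum(dv);

% Format the first timestamp back into readable text as a sanity check.
disp(datestr(serial(1), 'yyyy-mm-dd HH:MM:SS'));
```

The conversion to double matters here: concatenating integer columns keeps the integer class, which datenum does not accept as a date vector.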

Alternatively, as I suggested in the comments, you can embed the quotes in the format string:

fmt = '"%4u%2u%2u%2u%2u%2u" "%u" "%u"%q%q"%f" "%f" "%f" "%u"';
out = textscan(fid,fmt,'Delimiter',',')

Note how I put a space between adjacent quotes (" "); otherwise textscan() cannot recognize where fields really end.

However, I would personally go for an explicit conversion of the date to a serial date number:

fmt = '%s%u%u%s%s%f%f%f%u';
out = textscan(fid,fmt,'Delimiter',',','Whitespace','"')
out{1} =  datenum(out{1},'yyyymmddHHMMSS');
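
To illustrate what this conversion gives you, here is a small self-contained sketch that skips the file entirely. The variable stamps is a hypothetical stand-in for out{1}, filled with the first two timestamps from the question. It also shows datevec, which (under the same format string) recovers the six date components that the question wanted to split out.

```matlab
% Hypothetical stand-in for out{1}: timestamps read as strings with %s.
stamps = {'20100324072328'; '20100324072519'};

% Serial date numbers (days, with the time of day as the fraction).
serial = datenum(stamps, 'yyyymmddHHMMSS');

% One row per timestamp: [year month day hour minute second].
parts = datevec(stamps, 'yyyymmddHHMMSS');

disp(parts);
```

Reading the field as a single string and converting afterwards keeps the textscan format simple, at the cost of one extra conversion step.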