I am trying to read in a file that contains thousands of lines of the format:

AAAAAAAA    2013.99.2314.029    0    OFF    N

This is a tab-delimited file. The last column can be ignored. The two columns before that are variable, so I read them as strings. My main problem is the second column, which is a number divided into several parts:

2013.99.2314.029

is year 2013, day 99, second 2314.029.

I want to use textscan to read in the whole file at once, but somehow split that complicated date string as I read it in.

Currently I have the scan string:

SCAN_STR = '%s\t%f.%f\t%s\t%s\t%*s'

Which reads the date string into two floats. What I'd really like is to read it into two ints and a float. But using

SCAN_STR = '%s\t%d.%d.%f\t%s\t%s\t%*s'

truncates it to 2013 and 2314 and messes up the rest of the line. I tried escaping the '.' with '\.' but that raises an error.

Any suggestions? I'd like to do this as it's scanned in due to the large size of the file. Memory runs low when you start trying to change the types of large data sets.

EDIT:

Really I need a scan string for 2013.99.2314.029 to return two integers and a float.

'%d.%d.%f'

doesn't work, nor does setting the delimiter to '.'. I tried %u as well, but it rounds the decimals as it reads them in.

Le sigh.

First thing that comes to mind: instead of %d, you can try %[0-9]. However, this would read the integers as strings, so you'll have to convert them to numbers later (e.g. using str2num) if you need their numerical value. - Eitan T
Use textscan once for the whole line, then textscan again on the field you want to split up further? - Ansari
I hate textscan. Why don't they put some decent text parsing in Matlab? - Bitwise
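
The first two comments suggest reading the date field as a string and splitting it afterwards. A minimal sketch of that second step follows, assuming the date column has already been read with %s into a cell array; the names dates, pat, and vals are hypothetical, and the second sample value is made up for illustration.

```matlab
% Hypothetical cell array of date strings, as a %s column from textscan would give
dates = {'2013.99.2314.029'; '2013.100.5.5'};

% Year, day of year, and seconds (with an optional fractional part)
pat = '^(\d+)\.(\d+)\.(\d+(?:\.\d+)?)$';

% One 1x3 cell of tokens per input string
tok = regexp(dates, pat, 'tokens', 'once');

% Stack into an N-by-3 cell array, then convert everything to double
tok  = vertcat(tok{:});
vals = str2double(tok);

yr  = int32(vals(:, 1));   % years as integers
doy = int32(vals(:, 2));   % days of year as integers
sec = vals(:, 3);          % seconds, kept as double
```

This holds the strings and the numeric copy in memory at the same time, so for a very large file it may be worth processing the column in chunks.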

1 Answer

I just tried this with MATLAB 2012b and it seems to work on my end.

SCAN_STR = '%s\t%4d.%d.%f\t%d\t%s%*[^\n]'
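
Since textscan behaviour can differ between releases, it may be worth checking any candidate scan string against a single line before running it on the whole file. The sketch below does that for the sample line from the question; sampleLine and C are hypothetical names, and no particular output is assumed.

```matlab
% One sample line from the question, rebuilt with real tab characters
sampleLine = sprintf('AAAAAAAA\t2013.99.2314.029\t0\tOFF\tN');

% Scan string under test (here, the one proposed in this answer)
SCAN_STR = '%s\t%4d.%d.%f\t%d\t%s%*[^\n]';

% textscan also accepts a character vector in place of a file identifier
C = textscan(sampleLine, SCAN_STR);

% Print what each output cell actually holds
for k = 1:numel(C)
    disp(C{k});
end
```

If the three date fields do not print as 2013, 99, and 2314.029, the scan string is not splitting the field as intended.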