DATA:

Assume the following data format (with a header line in the first row, 500+ rows):

Number, Number, Number, String, String, Number, Number, Number, String, Number, String, String

Example: 1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,C85,S

MY CODE:

I am ignoring columns 4, 9, 11 and 12 (indices starting from 1).

[A, B, C, D, E, F, G, H] = textread("train.csv","%d %d %d %*q %s %d %d %d %*s %*s %f %*s %*s","delimiter",",","endofline","\n","headerlines","1");

THE ERROR:

error: invalid conversion from string to real scalar
error: fskipl: invalid number of lines specified
error: called from:
error:   /usr/share/octave/3.6.4/m/io/textread.m at line 71, column 5

I am new to Octave and unable to understand the cause of the error. Please guide.

...,"headerlines","1") should probably be ...,"headerlines",1) - erikced
That's right. Though the FORMAT is a little messed up. Using %q gives an error: strread: A(I): index out of bounds; value 1 out of bound 0 - dmkathayat

1 Answer

Apart from the headerlines issue mentioned above, you have 13 conversion specifications but only 12 columns: column 9 should correspond to one %*s, not two. If I change the format string to

%d %d %d %*q %s %d %d %d %*s %f %*s %*s

parsing a small sample file works as expected in Matlab.
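
The mismatch described above can be checked without touching the file by counting the specifiers in each format string. This is only a sketch: the variable names (fmt_old, fmt_new, n_old, n_new, n_out) are made up for illustration, and it assumes the specifiers are separated by single spaces, as they are in the question.

```octave
% Format string from the question and the corrected one from this answer.
fmt_old = '%d %d %d %*q %s %d %d %d %*s %*s %f %*s %*s';
fmt_new = '%d %d %d %*q %s %d %d %d %*s %f %*s %*s';

% One specifier per space-separated token.
specs_old = strsplit(fmt_old, ' ');
specs_new = strsplit(fmt_new, ' ');
n_old = numel(specs_old);   % number of specifiers in the original format
n_new = numel(specs_new);   % number of specifiers in the corrected format

% Specifiers containing '*' are skipped, so they produce no output variable.
n_out = sum(cellfun(@(s) isempty(strfind(s, '*')), specs_new));

fprintf('old: %d specifiers, new: %d specifiers, outputs: %d\n', n_old, n_new, n_out);

% Full call with both fixes applied (needs train.csv; not verified on Octave 3.6.4,
% where the comment above reports that %q fails in strread):
% [A, B, C, D, E, F, G, H] = textread('train.csv', fmt_new, 'delimiter', ',', 'headerlines', 1);
```

The corrected format should give 12 specifiers for the 12 columns, and the number of non-skipped specifiers should equal the eight output variables A to H. Note that headerlines is passed as the number 1, not the string "1".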