Data

Assume the following data format (with a header line in the first row, 500+ rows):

1, "<LastName> ,<Title>. <FirstName>", <Gender>, 99.9

My Code

I've tried this (IGNORE: see edit below):

[flag, name, gender, age] = textread('file.csv', '%d %q %s %f', 'headerlines', 1);

The Error

...and I get the following error message:

error: textread: A(I): index out of bounds; value 1 out of bound 0
error: called from: 
error:   C:\Program Files\Octave\Octave3.6.2_gcc4.6.2\share\octave\3.6.2\m\io\textread.m at line 75, column 3

Questions:

  • Is my format string incorrect given the text qualifier (and the comma embedded in the "name" string)?
  • Am I even using the correct method of loading a CSV into MATLAB/Octave?

EDIT

I forgot the delimiter. With it added, the call still fails, but the error is now raised from a different line, in strread.m:

[flag, name, gender, age] = textread('file.csv', '%d %q %s %f', 'headerlines', 1, 'delimiter', ',');

Those functions are for numerical data only. - Adrian Torrie

1 Answer

I went with the code below. Note, however, that it splits the text-qualified name field into two separate fields, so any text-qualified field that contains the field delimiter will create an extra output column. (I'm still interested to know why the %q format didn't work for this field; whitespace, perhaps?)

% Begin CSV Import ============================================================================

    % strrep is used to strip the text qualifier out of each row. This is wrapped around the
    % call to textread, which brings the comma delimited data in row-by-row, and skips the 1st row,
    % which holds column field names.
    tic;
    data = strrep(
                    textread(
                                'file.csv'          % File name within current working directory
                                ,'%s'               % Each row is a single string
                                ,'delimiter', '\n'  % Each new row is delimited by the newline character
                                ,'headerlines', 1   % Skip importing the first n rows
                            )
                    ,'"'
                    ,''
                );

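    % Split each row on every comma. The quotes were stripped above, so a comma inside a
    % text-qualified field also splits that field.
    csvfile = {};                           % start from an empty cell array
    for i = 1:numel(data)
        delimpos = strfind(data{i}, ',');   % strfind replaces the older findstr

        start = 1;
        for j = 1:numel(delimpos) + 1

            if j <= numel(delimpos)
                csvfile{i,j} = data{i}(start:delimpos(j) - 1);
                start = delimpos(j) + 1;
            else
                csvfile{i,j} = data{i}(start:end);   % last field runs to the end of the row
            end

        end
    end

    % Return summary information to user
    printf('\nCSV load completed in -> %f seconds\nm rows returned = %d\nn columns = %d\n', toc, size(csvfile, 1), size(csvfile, 2));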
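
The loop above splits on every comma, which is why the quoted name lands in two columns. One possible way around that, as a rough sketch rather than a tested drop-in replacement, is to find the commas while the quotes are still in the row, skip any comma that sits between an opening and a closing quote, and only then strip the quotes from each field. The sample row below is made up to match the format in the question, and the variable names (`in_quotes`, `starts`, `stops`, `fields`) are illustrative only. It assumes quotes are always balanced and never escaped inside a field.

```octave
% Made-up sample row in the question's format (not taken from the real file).
row = '1, "Smith ,Mr. John", male, 34.5';

% A character is inside a quoted field while an odd number of quotes precede it.
in_quotes = mod(cumsum(row == '"'), 2) == 1;

% Keep only the commas that are outside quoted fields.
delimpos = find(row == ',' & ~in_quotes);

% Field boundaries: each field runs from just after one delimiter to just before the next.
starts = [1, delimpos + 1];
stops  = [delimpos - 1, numel(row)];

fields = cell(1, numel(starts));
for k = 1:numel(starts)
    piece = row(starts(k):stops(k));
    fields{k} = strtrim(strrep(piece, '"', ''));   % drop the qualifier, then trim spaces
end

% fields is now {'1', 'Smith ,Mr. John', 'male', '34.5'}
```

To use this with the import above, the `strrep` wrapper around `textread` would have to go, so the quotes are still present when each row is split; the body of this sketch would then stand in for the inner `for j` loop.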