3
votes

I'm trying to write an Octave program that will convert a .mat file to a .csv file. The .mat file has a matrix X and a column vector y. X is populated with 0s and 1s, and y is populated with labels from 1 to 10. I want to put y in front of X as the first column and write the result as a .csv file.

Here is a code snippet of my first approach:

load(filename, "X", "y");
z = [y X];

basename = split{1};
csvname = strcat(basename, ".csv");

csvwrite(csvname, z);

The resulting file contains lots of really small decimal numbers, e.g. 8.560596795891285e-06,1.940359477121703e-06, etc...

My second approach was to loop through and manually write the values out to the .csv file:

load(filename, "X", "y");
z = [y X];

basename = split{1};
csvname = strcat(basename, ".csv");
csvfile = fopen(csvname, "w");

numrows = size(z, 1);
numcols = size(z, 2);

for i = 1:numrows
  for j = 1:numcols
    fprintf(csvfile, "%d", z(i, j));
    if j == numcols
      fprintf(csvfile, "\n");
    else
      fprintf(csvfile, ",");
    end
  end
end

fclose(csvfile);

That gave me a correct result, but took a really long time.

Can someone tell me either how to use csvwrite so that it writes the correct values, or how to create the .csv file manually in a more efficient way?

Thanks!


2 Answers

4
votes

The problem is that if y is of type char, your X matrix gets converted to char, too, when the two are concatenated. Since your labels are nothing but numbers, you can simply convert them to numbers and save the data using csvwrite:

csvwrite('data.txt', [str2num(y) X]);
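
To check whether this is what is happening, here is a minimal sketch with made-up toy data (the values of y and X below are hypothetical, not taken from the question's .mat file). It only illustrates how mixing char and double in a concatenation changes the class of the result, and how str2num undoes that for the labels:

```octave
% Hypothetical toy data: labels stored as characters, features as doubles.
y = ['1'; '2'; '3'];
X = [0 1; 1 0; 1 1];

z_bad = [y X];          % mixing char and double yields a char array
disp(class(z_bad));

y_num = str2num(y);     % convert the label characters to numbers
z = [y_num X];          % now an ordinary double matrix
disp(class(z));
```

Running class(y) on the real data right after load shows whether the conversion is needed at all.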

Edit: Also, in the loop you save the numbers using the integer conversion %d, while csvwrite writes doubles if your data is of type double. If the zeros are not exactly zero, csvwrite will write them in scientific notation, while your loop will round them. Hence the different behavior.
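
If the tiny values really are meant to be zeros, one possible way to get integer output without a loop is to round first and then write with an integer format. This is only a sketch under that assumption: the data below is made up, the temporary file name is for illustration, and rounding is not appropriate if the small values carry meaning. It relies on dlmwrite accepting an fprintf-style format through its "precision" option:

```octave
% Hypothetical data: values that are "almost" 0 or 1, plus integer labels.
X = [8.56e-06 1; 1 1.94e-06];
y = [3; 10];

z = round([y X]);                      % snap near-integers to exact integers
csvname = strcat(tempname(), '.csv');  % temporary file, for illustration only

% Write every entry with the integer format "%d", comma-separated.
dlmwrite(csvname, z, 'delimiter', ',', 'precision', '%d');

txt = fileread(csvname);               % read the file back to inspect it
disp(txt);
delete(csvname);
```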

3
votes

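Just a heads up: your code isn't optimized for MATLAB / Octave. Switch the for i and for j lines around.

Octave stores matrices in column-major order, so what you're doing is not cache efficient. Making this change will probably speed up the overall loop to an acceptable time. Note that swapping the loops as-is also changes the order in which values are written, so the data would have to be transposed first (and the newline check updated) to keep the same CSV layout.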
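
As an alternative to reordering the loops, the per-element fprintf calls can be replaced by a single call. This is a sketch, not a drop-in replacement: z below is hypothetical stand-in data, the temporary file name is for illustration, and it assumes z already holds exact integers. It builds one format string per row and uses the fact that fprintf consumes its data in column-major order:

```octave
% Hypothetical data standing in for z = [y X].
z = [3 0 1; 10 1 0];
numcols = size(z, 2);

% One "%d" per column, comma-separated, ending in a newline: "%d,%d,%d\n".
fmt = [repmat('%d,', 1, numcols - 1) '%d\n'];

csvname = strcat(tempname(), '.csv');  % temporary file, for illustration only
csvfile = fopen(csvname, 'w');
% fprintf walks through its data column by column, so pass the
% transpose to emit z row by row.
fprintf(csvfile, fmt, z.');
fclose(csvfile);

txt = fileread(csvname);               % read the file back to inspect it
delete(csvname);
```

The format string is reused automatically for each group of numcols values, so the whole matrix is written with one call instead of one call per element.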