
I've got a fairly large and complex text file to read into MATLAB. The basic format looks something like this:

000723       4       123.12345       5       234.76543   ...    178.94444\n

The first column is always a six-digit date in yymmdd format, and the last column is always a double with "\n" on the end, and does not have an integer column preceding it. The "..." indicates where you would see more columns if they existed. Additional columns all come in pairs and follow the format of the preceding few, i.e.:

integer       double

How might I go about doing this? It seems that most of the options to read data in require me to know the dimensions, but it is constantly changing with this dataset, and will always have variable columns in each row.

I'd love to get it into a simple matrix where the columns are:

date(from datenum) - double corresponding with integer 1 - double corresponding with integer 2 - ... - final double value

And if there was no occurrence of the integer in that row then it just gives a 0 or NaN in that matrix location.

The importdata function may help - Luis Mendo
Yeah, I've used importdata to do this in the past. It at least gets the values into MATLAB in a sensible manner, but the columns don't align and I have to use a separate function I've written to get it all organized. I was hoping for a more direct "all-at-once" method than this. - A Blue Shoe

1 Answer


If importdata doesn't work, I'd try something like textscan.

First, open the file:

fid = fopen(FILENAME, 'rt');

Then specify the data type of each column, like so:

a = textscan(fid, '%s %f %f %f %f %f');

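Then convert the first column (read in as strings) into MATLAB date numbers and build a matrix:

%// Convert the yymmdd strings to serial date numbers
data = datenum(a{1}, 'yymmdd');

%// Append the remaining numeric columns one at a time
for j = 2:numel(a)
    data = horzcat(data, a{j});
end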

I've taken this approach on data sets before, but your file also has a "\n" sequence at the end of each line that needs to be accounted for; otherwise MATLAB will read the last column as NaN.

Here's an iterative solution I quickly came up with:

data = {};

%// Iterate through all the lines in the file
tline = fgets(fid);
while ischar(tline)
    %// Remove the literal "\n" sequence from the line
    str = regexprep(tline, '\\n', '');

    %// Vertically concatenate the parsed row with the accumulated data
    data = vertcat(data, textscan(str, '%s %f %f %f %f %f'));

    %// Get the next line
    tline = fgets(fid);
end

fclose(fid);
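
The loop above uses a fixed six-column format, so it does not yet place each double in the column matching its integer. Here is a rough sketch of how that could be done line by line. The function name readPairedColumns is made up for illustration, and it assumes that every integer is a positive whole number, that each row is a date followed by zero or more integer/double pairs and one final double, and that the "\n" at the end of each row is a literal backslash-n in the file. Treat it as a starting point rather than a tested solution.

```matlab
function data = readPairedColumns(filename)
%// Hypothetical helper: returns [datenum, value for integer 1, ...,
%// value for integer K, final value], with NaN where an integer is absent.
fid = fopen(filename, 'rt');
if fid == -1
    error('Could not open %s', filename);
end
cleanup = onCleanup(@() fclose(fid));   %// close the file even on error

dates  = [];   %// one serial date number per row
finals = [];   %// trailing double of each row
pairs  = {};   %// per-row matrix of [integer, double] pairs
maxId  = 0;    %// largest integer seen so far

tline = fgetl(fid);
while ischar(tline)
    %// Assumption: the "\n" in the file is a literal backslash followed by n
    tline  = strrep(tline, '\n', '');
    tokens = strsplit(strtrim(tline));  %// split on whitespace

    if numel(tokens) >= 2
        nums = str2double(tokens(2:end));
        %// Everything except the last number is an integer/double pair
        p = reshape(nums(1:end-1), 2, []).';

        dates(end+1, 1)  = datenum(tokens{1}, 'yymmdd');
        finals(end+1, 1) = nums(end);
        pairs{end+1, 1}  = p;
        maxId = max([maxId; p(:, 1)]);
    end
    tline = fgetl(fid);
end

%// Column 1 is the date, columns 2..maxId+1 are indexed by the integer,
%// and the last column is the final double
data = NaN(numel(dates), maxId + 2);
data(:, 1)   = dates;
data(:, end) = finals;
for r = 1:numel(dates)
    p = pairs{r};
    data(r, p(:, 1) + 1) = p(:, 2).';
end
end
```

With this layout, a row containing the pair "4  123.12345" ends up with 123.12345 in column 5. Missing integers stay NaN; swap NaN for zeros in the preallocation if you would rather have 0.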

I can't guarantee these quick code samples are bug free, but I hope they help you!