I need to load a data file, test.dat, into MATLAB. The contents of the data file look like this:

*a682 1233~0.2
*a2345 233~0.8 345~0.2 4567~0.3
*a3457 345~0.9 34557~1.2 34578~0.2 9809~0.1 2345~2.9 23452~0.9 334557~1.2 234578~0.2 19809~0.1 23452~2.9 3452~0.9 4557~1.2 3578~0.2 92809~0.1 12345~2.9 232452~0.9 33557~1.6 23478~0.6 198099~2.1 234532~2.9 …

How can I read this type of file into MATLAB and use the leading term, such as *a2345, to identify a row that links to its corresponding terms, such as 233~0.8 345~0.2 4567~0.3?

Thanks.

Does the * always mean a new row? And is there any other relation between rows in the data that you would need to take into account? - St-Ste-Ste-Stephen

1 Answer

Because each row has a different length, you have to use a cell array or a structure, or pad a matrix with NaN or zeros. I chose a cell array; I hope that is OK. The ids are found with one pattern, and a second pattern matches each whole token, so the cells hold entries such as 345~0.9. Here is the code:

datfile = 'test.dat';
raw = fileread(datfile);   % avoid naming the variable "text", which shadows a MATLAB function

% Row ids such as *a682: a literal '*' (escaped), an optional letter, then digits
row1 = regexp(raw,'\*[a-z]?\d+','match');
data = cell(numel(row1),2);
data(:,1) = row1(:);

% Splitting on the ids leaves the text that follows each id;
% the first piece (before the first id) is empty and is dropped
row2 = regexp(raw,'\*[a-z]?\d+','split');
row2 = row2(2:end)';
for i = 1:size(row2,1)
   % Keep each whole key~value token, e.g. 345~0.9
   data{i,2} = regexp(row2{i},'\d+~[\d.]+','match');
end

This creates a cell array called data in which the first column of each row holds the row id (e.g. *a682) and the second column holds a cell array with that row's data values. To access them you could use:

data{1,1}

to show the id

data{1,2}

to show the cell contents

data{1,2}{1}

to show the specific data point

This should work and is relatively simple!
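
If the goal is to look a row up by its id and work with the numbers rather than with strings, one possible follow-up is to parse each token into a numeric pair and keep the rows in a containers.Map keyed by id. This is only a minimal sketch under stated assumptions: the inline sample string stands in for the file contents, and the names rowMap, tokens, pairs and vals are made up for illustration.

```matlab
% Sample text standing in for fileread('test.dat') (assumed for illustration)
raw = sprintf('*a682 1233~0.2\n*a2345 233~0.8 345~0.2 4567~0.3\n');

% Row ids and the text that follows each id
ids  = regexp(raw, '\*[a-z]?\d+', 'match');
rest = regexp(raw, '\*[a-z]?\d+', 'split');
rest = rest(2:end);              % first piece (before the first id) is empty

% Map from id string to an N-by-2 numeric matrix [key value]
rowMap = containers.Map('KeyType', 'char', 'ValueType', 'any');
for i = 1:numel(ids)
    % Capture the two numbers on either side of each '~'
    tokens = regexp(rest{i}, '(\d+)~([\d.]+)', 'tokens');
    pairs  = zeros(numel(tokens), 2);
    for k = 1:numel(tokens)
        pairs(k, :) = str2double(tokens{k});
    end
    rowMap(ids{i}) = pairs;
end

% Look up one row by its id
vals = rowMap('*a2345');
```

With this layout, vals(:,1) holds the keys and vals(:,2) the values for the requested row, and isKey(rowMap, '*a682') can be used to check whether an id is present before looking it up.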