Via textscan, I am reading a number of .txt files:

fid1 = fopen('Ev_An_OM2_l5_5000.txt','r');

This is a simplification as in reality I am loading several hundred .txt files via:

files = dir('Ev_An*.txt');

Important information that is not present within the .txt files themselves is instead encoded in the filename.

Is there a way to concisely extract portions of the filename and save them as strings/numbers? For example saving 'OM2' and '5000' from the above filename as variables.

fileparts appears to require the full path of the file rather than just defaulting to the MATLAB folder as with textscan.

How do you construct the filename in the first place, to be used in fopen? Why do you need to reverse-engineer? - Andras Deak
Added an edit to explain this. - AnnaSchumann
You misunderstood my question. How can you tell MATLAB to open 'Ev_An_OM2_l5_5000.txt' and not something else? How do you construct this filename? Do you have a database with the filenames? - Andras Deak
This is a simplified example for the purposes of the question - I am otherwise using files = dir('Ev_An*.txt'); to batch load hundreds of .txt files with similar filenames from the default MATLAB directory. - AnnaSchumann
Thanks, this is what I was interested in. See my answer below and whether it's suitable. - Andras Deak

1 Answer

It depends on how fixed your filename format is. If the name is stored in the variable filename, you can use regexp to extract parts of it, like so:

filename = 'Ev_An_OM2_l5_5000.txt'; %or whatever
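parts = regexp(filename,'[^_]+_[^_]+_([^_]+)_[^_]+_([^\.]+)\.txt','tokens','once');

With the 'once' option, parts is a cell array holding the two captured tokens, so parts{1} is 'OM2' and parts{2} is '5000'. (Without 'once', regexp nests the tokens one level deeper, as parts{1}{1} and parts{1}{2}.) This assumes that your filename is always in the form of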

something_something_somethingofinterest_something_somethingofinterest.txt
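If the fields are always separated by underscores, a regular expression is not strictly needed. The following is a minimal sketch of an alternative using fileparts and strsplit, not part of the approach above; the variable names base, fields, label and count are only illustrative. Note that fileparts accepts a bare filename without a folder.

```matlab
filename = 'Ev_An_OM2_l5_5000.txt';

% fileparts works on a bare name: the folder part is simply empty
[~, base] = fileparts(filename);   % base is 'Ev_An_OM2_l5_5000'

% Split on underscores; assumes the field positions never change
fields = strsplit(base, '_');      % {'Ev','An','OM2','l5','5000'}

label = fields{3};                 % third field, here 'OM2'
count = fields{5};                 % fifth field, here '5000' (still text)
```

This is shorter, but it silently returns the wrong fields if a filename has a different number of underscores, whereas the regexp version simply fails to match.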

Update:
If you like structs more than cells, then you can name your tokens like so:

parts = regexp(filename,'[^_]+_[^_]+_(?<first>[^_]+)_[^_]+_(?<second>[^\.]+)\.txt','names');

in which case parts.first=='OM2' and parts.second=='5000'. You can obviously name your tokens according to their actual meaning, since they are important. You just have to change first and second accordingly in the code above.
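Since the question also asks for numbers, here is a small sketch of converting a named token to a numeric value with str2double. The field names label and count are made up for this example; use whatever describes your data.

```matlab
filename = 'Ev_An_OM2_l5_5000.txt';

% Same pattern as above, with illustrative token names
expr = '[^_]+_[^_]+_(?<label>[^_]+)_[^_]+_(?<count>[^\.]+)\.txt';
parts = regexp(filename, expr, 'names');

label = parts.label;               % char vector 'OM2'
count = str2double(parts.count);   % double 5000; NaN if the text is not numeric
```

str2double returns NaN rather than raising an error when the token is not a valid number, so it is worth checking with isnan if the filenames are not fully under your control.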

Update2:
If you use dir to get your filenames, you get a struct array containing a lot of information you don't need here. If you really just need the file names, I'd use a for loop like so:

files = dir('Ev_An*.txt');
for i=1:length(files)
   filename=files(i).name;
   parts = regexp(filename,'[^_]+_[^_]+_(?<first>[^_]+)_[^_]+_(?<second>[^\.]+)\.txt','names');

   %here do what you want with parts.first, parts.second and the file itself
end
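If you want to keep the extracted values for all files rather than process them one at a time, one possible sketch is to collect them in a struct array. The results variable and its layout are assumptions for illustration, as is the choice to convert the second token to a number.

```matlab
files = dir('Ev_An*.txt');
expr = '[^_]+_[^_]+_(?<first>[^_]+)_[^_]+_(?<second>[^\.]+)\.txt';

% Empty struct array with the fields we intend to fill
results = struct('name', {}, 'first', {}, 'second', {});

for k = 1:numel(files)
    parts = regexp(files(k).name, expr, 'names', 'once');
    if isempty(parts)
        continue   % name matched the dir wildcard but not the full pattern
    end
    % Append one entry per file; field order matches the template above
    results(end+1) = struct('name', files(k).name, ...
                            'first', parts.first, ...
                            'second', str2double(parts.second));
end
```

Afterwards results(k).first and results(k).second hold the values for each matching file, and [results.second] gives all the numeric values as one vector.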