0
votes

I have 1000 text files and want to read a number from each file.

The format of each text file is:

 af;laskjdf;lkasjda123241234123
$sakdfja;lskfj12352135qadsfasfa
falskdfjqwr1351

##alskgja;lksjgklajs23523,
asdfa#####1217653asl123654fjaksj
 asdkjf23s#q23asjfklj
asko3

I need to read the number ("1217653") that follows "#####" in each text file.

The number immediately follows "#####" in every text file.

"#####" and the number right after it appear only once in each file.

clc
clear
MyFolderInfo = dir('yourpath/folder');
fidin = fopen(file_name,'r','n','utf-8');
while ~feof(fidin)
    tline=fgetl(fidin);
    disp(tline)
end
fclose(fidin); 

It is not finished yet. I am stuck because it cannot read past the blank line.

2
It would actually help if you showed a realistic version of your text file, not a result of you mashing your keyboard with random characters. - rayryeng
XJ.C, the reason for rayryeng's request is that it's very common for people to provide some quick sample data, but when someone posts an answer that works for that case they respond: "This doesn't work, because my actual data doesn't contain blah blah", or "What I actually need is all data behind #####, not only numbers." All of us have been burned on this, so we like to make sure the scenario we work with is actually realistic, otherwise it's just a waste of time for everyone involved. - Stewie Griffin
Also, this question doesn't follow the guidelines in How to Ask. Please show us what you have tried, this is not a code writing service =) - Stewie Griffin
@StewieGriffin I just want to read the number behind "#####", say "1217653" on the above sample text. - XJ.C
Can there be several instances? If so, how do you want to store the results? Can there be several instances on a single line? What about #####abc123? Do you want 123 there or not? - Stewie Griffin

2 Answers

4
votes

This is another approach, using the function regexp. It gives a more flexible way of reading files and does not require reading the full file in one go. The main difference from the answer already given is that I read the file line by line, but since the code in the question uses this approach I believe it is worth answering. This will return all occurrences of "#####NUMBER".

function test()
h = fopen('myfile.txt');
res = {}; % One cell per line, each holding that line's matches
k = 1;
str = fgetl(h);
while ischar(str) % fgetl returns '' for an empty line and numeric -1 at EOF
    res{k} = regexp(str,'#####\d+','match');
    k = k+1;
    str = fgetl(h);
end
fclose(h); % Release the file handle

for k=1:length(res)
    disp(res{k});
end

EDIT

Using the expression '#####(\d+)' and the argument 'tokens' instead of 'match' will return only the digits after the "#####", as a string. Apart from showing another way to read the file, the intent of this post was to show how to use regexp with a simple example. Both alternatives can be used with suitable conversion.
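As a minimal sketch of the 'tokens' variant, the snippet below applies it to a single line copied from the sample in the question. The variable names (sampleLine, tok, num) are only illustrative, and the 'once' option is an addition here so that regexp returns just the first match.

```matlab
sampleLine = 'asdfa#####1217653asl123654fjaksj'; % line taken from the question's sample
tok = regexp(sampleLine, '#####(\d+)', 'tokens', 'once'); % 1x1 cell holding the captured digits
num = str2double(tok{1}); % convert the digit string to a double
disp(num)
```

If a line contains no "#####NUMBER", tok is an empty cell and tok{1} would error, so inside a loop the result should be checked with isempty first.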

2
votes

Assuming the following:

  • All files are ASCII files.
  • The number you are looking to extract is directly following #####.
  • The number you are looking for is a natural number.
  • ##### followed by a number only occurs once per file.

You can use this code snippet inside a for loop to extract each number:

regx='#####(\d+)';
str=fileread(fileName);

num=str2double(regexp(str,regx,'tokens','once'));

Example of for loop

This code will iterate through ALL files in yourpath/folder and save the numbers into num.

regx='#####(\d+)'; % Create regex

folderDir='yourpath/folder';
listing=dir(folderDir); % List everything in folderDir
files={listing(~[listing.isdir]).name}; % Keep files only (this also drops . and ..)

num=zeros(1,length(files)); % Pre allocate

for i=1:length(files) % Iterate through files
    str=fileread(fullfile(folderDir,files{i})); % Extract str from file
    num(i)=str2double(regexp(str,regx,'tokens','once')); % extract number using regex
end
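The loop above relies on the assumption that every file contains a match. If that might not hold, one possible guard is sketched below; storing NaN for a missing match is a choice made here for illustration, and the sample text is made up.

```matlab
regx = '#####(\d+)';
str = sprintf('no marker here\n12345'); % hypothetical file content without "#####"
tok = regexp(str, regx, 'tokens', 'once'); % empty cell when nothing matches
if isempty(tok)
    value = NaN; % assumption: flag files without a match as NaN
else
    value = str2double(tok{1});
end
disp(value)
```

Inside the for loop, value would be assigned to num(i), and isnan(num) can then be used afterwards to find the files that had no match.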

If you want to extract more "advanced" numbers, e.g. integers or real numbers, or handle several occurrences of #####NUMBER in a file, you will need to update your question with a better representation of your text files.
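For the several-occurrences case, a rough sketch is shown below, still assuming natural numbers only. The sample text is invented, and returning all values as one row vector is just one way the results could be stored.

```matlab
str = sprintf('a#####12b\nc#####345d'); % hypothetical text with two markers
toks = regexp(str, '#####(\d+)', 'tokens'); % without 'once': one 1x1 cell per match
nums = str2double([toks{:}]); % flatten to a cell array of strings, then convert
disp(nums)
```

Here nums has one element per occurrence, in the order the matches appear in the text.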