
I have a file (file_1) that I would like to read line by line, using each line as a pattern to grep in another file (file_2). The pattern should only match against the first column of file_2.

file_1

1
2
78
GL.1234
22

file_2

#blahblah hello this is some file
1 this is still some file 345
1 also still a 12 file
78 blah blah blah
22 oh my gosh, still a file!
GL.1234 hey guys, it's me. just being a file
2 i think that's it. 

output

1 this is still some file 345
1 also still a 12 file
2 i think that's it. 
22 oh my gosh, still a file!
78 blah blah blah
GL.1234 hey guys, it's me. just being a file

I have tried:

cat file_1.txt | while read line; do awk -v line = $line '{if ($1 == line) print $0;}' < file_2.txt > output.txt; done

and

cat file_1.txt | while read line; do grep -E '$line\b' < file_2.txt > output.txt; done 
Comment: "2 i think that's it." will also be in the output because you have 2 in file_1. (anubhava)

1 Answer


Looking at your script, it seems this can all be done in a single awk command:

awk 'NR==FNR{seen[$1]; next} $1 in seen' file1 file2

Output:

1 this is still some file 345
1 also still a 12 file
78 blah blah blah
22 oh my gosh, still a file!
GL.1234 hey guys, it's me. just being a file
2 i think that's it.

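Basically, we sweep through the first file and store its first column as keys in an associative array, seen. Then, for each line of the second file, we check whether its first column exists in this array and, if so, print the record. The matches come out in the order they appear in the second file.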
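To try the two-pass idea end to end, here is a minimal sketch that recreates the sample data from the question in a scratch directory and runs the same awk command on it. The scratch directory and the `dir` and `result` variable names are assumptions made for illustration; the file names follow the .txt names used in the question's attempts.

```shell
#!/bin/sh
# Hypothetical scratch directory so the example does not touch real files.
dir=$(mktemp -d)

# Keys to look for (sample data from the question).
cat > "$dir/file_1.txt" <<'EOF'
1
2
78
GL.1234
22
EOF

# File to filter on its first column (sample data from the question).
cat > "$dir/file_2.txt" <<'EOF'
#blahblah hello this is some file
1 this is still some file 345
1 also still a 12 file
78 blah blah blah
22 oh my gosh, still a file!
GL.1234 hey guys, it's me. just being a file
2 i think that's it.
EOF

# NR==FNR holds only while awk reads the first file: record each key, then skip.
# For the second file, print a line when its first field is a recorded key.
result=$(awk 'NR==FNR { seen[$1]; next } $1 in seen' "$dir/file_1.txt" "$dir/file_2.txt")

printf '%s\n' "$result"
rm -rf "$dir"
```

Because `$1 in seen` compares the whole first field, the key 1 does not match 12 or 345 elsewhere on a line, and the #blahblah header is dropped. One caveat: if the first file is empty, NR==FNR stays true while awk reads the second file, so nothing is printed.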