3
votes

I have two tab-delimited files.

File1: db.txt

string1 string2 string3 001 string4
string5 string6 string7 002 string8
string9 string10 string11 003 string12

File2: query.txt

id1 001
id2 003

I want to match file1 against file2 and, where there is a match, print columns 1 to 5 of db.txt followed by column 1 of query.txt.

I tried using awk; here is my code:

awk 'BEGIN{FS=OFS="\t"}NR==FNR{a[$2]=$4;next}$4 in a{print $1,$2,$3,$4,$5,a[$1]}' query.txt db.txt

but the output contains only the matching lines (at least I think so) with the columns of db.txt, not the column from query.txt.

EDIT: my more complex db2.txt

string1 <TAB> string2 <TAB> 9999 abc dehi [way:pn9999] <TAB> 001 <TAB> org; string3 string4
string5 <TAB> string6 <TAB> 9999 dwd meti [way:pn8999] <TAB> 002 <TAB> org2; string7
string8 <TAB> string9 <TAB> 9999 dwd meti [way:pn7999] <TAB> 003 <TAB> org4; string10
2
Is that key always in the 4th field of the db file? - James Brown
a different key sometimes, but yes - rororo
Follow-up question: I have a problem with two slightly different files. I want to match two files on the first column and, when there is a match, print column 2 of file1 and of file2. This is my code: awk 'BEGIN{FS=OFS="\t"} FNR == NR { a[$1] = $1; next } $1 in a { print a[$2], $2 }' - rororo
Post it as a new question, you get more coverage for it. - James Brown

2 Answers

1
votes

You can use awk like this:

awk 'BEGIN{FS=OFS="\t"} FNR == NR { a[$2] = $1; next }
$4 in a { print $0, a[$4] }' query.txt db.txt

string1 string2 string3 001 string4 id1
string9 string10 string11 003 string12 id2
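
For context on why the attempt in the question prints no id: while reading query.txt it stores a[$2]=$4, but query.txt has only two fields, so the stored value is empty, and it later prints a[$1] instead of a[$4]. Below is a minimal, self-contained sketch for reproducing the corrected command. It assumes the sample data from the question, written to a scratch directory; the tmpdir and result variable names are only for illustration.

```shell
#!/bin/sh
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf 'string1\tstring2\tstring3\t001\tstring4\n'     >  "$tmpdir/db.txt"
printf 'string5\tstring6\tstring7\t002\tstring8\n'     >> "$tmpdir/db.txt"
printf 'string9\tstring10\tstring11\t003\tstring12\n'  >> "$tmpdir/db.txt"
printf 'id1\t001\nid2\t003\n' > "$tmpdir/query.txt"

# First file (query.txt): map key (column 2) -> id (column 1).
# Second file (db.txt): when column 4 is a known key, print the line plus the id.
result=$(awk 'BEGIN { FS = OFS = "\t" }
  FNR == NR { a[$2] = $1; next }
  $4 in a   { print $0, a[$4] }' "$tmpdir/query.txt" "$tmpdir/db.txt")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

The file order matters: query.txt must come first so that the lookup array is filled before db.txt is scanned.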
1
votes
AMD$ cat f1
id1 001
id2 003

AMD$ cat f2
string1 string2 string3 001 string4
string5 string6 string7 002 string8
string9 string10 string11 003 string12

AMD$ awk 'NR==FNR {a[$2]=$1; next} {for(i in a) if(index($0,i)) print a[i], $0}' f1 f2
id1 string1 string2 string3 001 string4
id2 string9 string10 string11 003 string12
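
One caveat with this approach: index($0,i) looks for the key anywhere in the line as a substring, so a key can also match inside an unrelated field. With data shaped like db2.txt from the question, where a field holds text such as [way:pn9999], that may produce false positives. The sketch below illustrates the difference using made-up sample lines (not taken from the question) and assumes the key is in column 4; the loose and strict variable names are hypothetical.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
printf 'id1\t001\n' > "$tmpdir/f1"
# Line 1 contains "001" only inside another field; line 2 has it as column 4.
printf 's1\ts2\t9999 abc [way:pn9001]\t002\torg; s3\n'  >  "$tmpdir/f2"
printf 's4\ts5\t9999 dwd [way:pn7999]\t001\torg2; s6\n' >> "$tmpdir/f2"

# Substring match over the whole line: counts both lines as matches.
loose=$(awk -F'\t' 'NR==FNR {a[$2]=$1; next}
  {for (i in a) if (index($0, i)) print a[i], $0}' "$tmpdir/f1" "$tmpdir/f2" | wc -l | tr -d ' ')

# Exact match on column 4 with tab as separator: only the second line matches.
strict=$(awk -F'\t' -v OFS='\t' 'NR==FNR {a[$2]=$1; next}
  $4 in a {print a[$4], $0}' "$tmpdir/f1" "$tmpdir/f2")

printf 'substring matches: %s\n' "$loose"
printf 'exact match:\n%s\n' "$strict"
rm -rf "$tmpdir"
```

If the key is always in a known column, the exact comparison is probably the safer choice; the substring search is mainly useful when the key's position varies.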