
I want to find the lines in file2 whose fields 1 and 4 match fields 1 and 2 of a line in file1, and then print those lines from file2. I am using:

awk 'FNR==NR{a[$1];b[$2];next} $1 in a && $4 in b' file1 file2 > output

The problem is that the output also contains lines whose fields each match something in file1, but not within the same row. For example, when file1 contains this line:

15     70589272    rs12148337     15     70589272    rs12148337            1 

And file2 contains this line:

10  rs181419901 0   70589272    4   2

The output contains the file2 line above even though its field 1 (10) does not match field 1 of the file1 line (15), presumably because 10 appears in field 1 of some other row of file1. Can I restrict the command to print only rows where both fields match within the same row?


1 Answer


You were pretty close. Store both fields as a single composite key, so that they have to match on the same row of file1:

awk 'FNR==NR{a[$1,$2];next} ($1,$4) in a' file1 file2
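To see why this works: `a[$1,$2]` joins the two fields into one key (awk concatenates them with its `SUBSEP` separator), so `($1,$4) in a` is only true when that exact pair occurred together on one line of file1. With two separate arrays, each field is checked against every row independently. Here is a small sketch comparing the two commands. The second line of each sample file is made up for illustration, and the sample files are trimmed to a few columns.

```shell
# Hypothetical sample data (not from the question, apart from the first
# line of each file); written to a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' '15 70589272 rs12148337' '10 12345 rs0000001' > "$tmp/file1"
printf '%s\n' '10 rs181419901 0 70589272 4 2' '15 rs12148337 0 70589272 1 3' > "$tmp/file2"

# Two independent arrays: 10 is known from row 2 of file1 and 70589272
# from row 1, so the first file2 line slips through as a false match.
separate=$(awk 'FNR==NR{a[$1];b[$2];next} $1 in a && $4 in b' "$tmp/file1" "$tmp/file2")

# Composite key: both values must come from the same row of file1.
paired=$(awk 'FNR==NR{a[$1,$2];next} ($1,$4) in a' "$tmp/file1" "$tmp/file2")

printf 'separate arrays:\n%s\n' "$separate"
printf 'composite key:\n%s\n' "$paired"
rm -rf "$tmp"
```

With this data the first command prints both file2 lines, while the second prints only the line starting with 15.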