2
votes

I would like to compare the contents of two files, File1.txt and File2.txt. When a line in column 1 of File2.txt matches a line in column 1 of File1.txt, I would like to output the whole line from File2.txt. If there is no match in File2.txt for the line in column 1 of File1.txt, then I would like to print the contents of the line in column 1 of File1.txt, then tab, then 0. I would also like to try to preserve the order of lines in column 1 in the output.

File1.txt

abc
def
ghi
jkl

File2.txt

abc    2
ghi    1

This is the command I have been using, but it only outputs the instances where there is a match. It does not print instances where there is no match followed by a 0 separated by a tab.

awk 'NR==FNR{a[$1];next} $1 in a{if ($1 in a) print $0;else print a[$1],"\t","0"}' File1.txt File2.txt 

What I think the code is doing below:

awk 'NR==FNR{a[$1];next} : create an array for column 1 of the first file.

$1 in a : loop through the array.

{if ($1 in a) print $0; if the line in File2.txt matches the line in the array, print all of the line in File2.txt

;else print a[$1],"\t","0"}' : if the line in File2.txt does not match a line in the array, print the line in File1.txt, tab, then "0".

but this is clearly not the case. I do not understand what I have done wrong.

Current output:

abc    2
ghi    1

Desired output:

abc    2
def    0
ghi    1
jkl    0

Can anyone explain why this does not print the contents of the line in File1.txt, then a tab, then 0 when there is no match?

1 Answer

6
votes

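Please try the following, written against the samples shown. The main change is the order in which awk reads the files: File2.txt first, File1.txt second. The output has to follow the order of File1.txt, and each of its lines is looked up against the values from File2.txt, so it is simpler to load all of File2.txt into an array first and then print the appropriate value for each line while reading File1.txt. In your command the main block only ever runs on lines of File2.txt, and the $1 in a pattern in front of it lets through only lines that already match, so the else branch can never run and the unmatched lines of File1.txt are never visited.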

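awk '
BEGIN{ OFS="\t" }                  # separate output fields with a tab
FNR==NR{                           # true only while reading the first file (File2.txt)
  arr[$1]=$2                       # remember the value in column 2 for each key
  next
}
{                                  # second file (File1.txt), read in its own order
  print $1,($1 in arr?arr[$1]:0)   # stored value, or 0 when there is no match
}
' File2.txt File1.txt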
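
If File2.txt ever has more than two columns, arr[$1]=$2 keeps only the second one, whereas the question asks for the whole line from File2.txt. The following is a minimal sketch of a variant that stores the whole line instead. It assumes tab-separated output is wanted and recreates the question's sample files in a scratch directory so that nothing existing is overwritten; the names line and merged.txt are only illustrative.

```shell
# Work in a throwaway directory so existing files are not touched.
cd "$(mktemp -d)"

# Recreate the sample inputs from the question (File2.txt is tab-separated here).
printf 'abc\ndef\nghi\njkl\n' > File1.txt
printf 'abc\t2\nghi\t1\n' > File2.txt

awk '
BEGIN { OFS = "\t" }
# First file on the command line (File2.txt): store each whole line under its key.
FNR == NR { line[$1] = $0; next }
# Second file (File1.txt): print the stored line, or the key, a tab and 0.
{
  if ($1 in line) print line[$1]
  else            print $1, 0
}
' File2.txt File1.txt > merged.txt

cat merged.txt
```

One caveat with the FNR==NR idiom: if File2.txt is empty, the condition stays true while File1.txt is being read, so nothing is printed at all. That case would need a separate check if it can occur in your data.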