5
votes

I need to compare column 1 and column 2 of my file1.txt and file2.txt. If both columns match, print the entire row of file1.txt, but where a row in file1.txt is not present in file2.txt, also print that missing row in the output and add "0" as its value in third column.

# file1.txt #

AA ZZ   
JB  CX
CX  YZ  
BB  XX
SU  BY  
DA  XZ  
IB  KK  
XY  IK
TY  AB

# file2.txt #

AA ZZ   222
JB  CX  345
BB  XX  3145
DA  XZ  876
IB  KK  234
XY  IK  897

Expected output # output.txt #

File1.txt
AA ZZ   222
JB  CX  345
CX  YZ  0
BB  XX  3145
SU  BY  0
DA  XZ  376
IB  KK  234
XY  IK  897
TY  AB  0

I tried this code but couldn't figure out how to add rows that did not match and add "0" to it

awk 'BEGIN { while ((getline <"file2.txt") > 0) {REC[$1]=$0}}{print REC[$1]}' < file1.txt > output.txt
3

3 Answers

5
votes

With your shown samples, could you please try following.

awk '
FNR==NR{
  arr[$1 OFS $2]
  next
}
(($1 OFS $2) in arr){
  print
  arr1[$1 OFS $2]
}
END{
  for(i in arr){
    if(!(i in arr1)){
      print i,0
    }
  }
}
' file1.txt file2.txt

Explanation: Adding detailed explanation for above.

awk '                    ##Starting awk program from here.
FNR==NR{                 ##Checking FNR==NR condition which will be TRUE when file1.txt is being read.
  arr[$1 OFS $2]         ##Creating array with 1st and 2nd field here.
  next                   ##next will skip all further statements from here.
}
(($1 OFS $2) in arr){    ##Checking condition if 1st and 2nd field of file2.txt is present in arr then do following.
  print                  ##Print the current line here.
  arr1[$1 OFS $2]        ##Creating array arr1 with index of 1st and 2nd fields here.
}
END{                     ##Starting END block of this program from here.
  for(i in arr){         ##Traversing through arr all elements from here.
    if(!(i in arr1)){    ##Checking if an element/key is NOT present in arr1 then do following.
      print i,0          ##Printing index and 0 here.
    }
  }
}
' file1.txt file2.txt    ##Mentioning Input_file names here.
4
votes

You may try this awk:

awk '
FNR == NR {
   map[$1,$2] = $3
   next
}
{
   print $1, $2, (($1,$2) in map ? map[$1,$2] : 0)
}' file2 file1

AA ZZ 222
JB CX 345
CX YZ 0
BB XX 3145
SU BY 0
DA XZ 876
IB KK 234
XY IK 897
TY AB 0
1
votes
$ awk '
    { key = $1 FS $2 }
    NR==FNR { map[key]=$3; next }
    { print $0, map[key]+0 }
' file2.txt file1.txt
AA ZZ 222
JB  CX 345
CX  YZ 0
BB  XX 3145
SU  BY 0
DA  XZ 876
IB  KK 234
XY  IK 897
TY  AB 0