I am trying to understand how processing two files with awk works, so I created an example.

file1.txt

zzz pq Fruit Apple 10
zzz rs Fruit Car 50
zzz tu Study Book 60

file2.txt

aa bb Book 100
cc dd  Car 200
hj kl XYZ 500
ee ff Apple 300
ff gh ABC 400

I want to compare the 4th column of file1 with the 3rd column of file2. If they match, I want to print the 3rd, 4th, and 5th columns of file1, followed by the 4th column of file2 and the sum of the 5th column of file1 and the 4th column of file2.

Expected Output:

Fruit Apple 10 300 310
Fruit Car 50 200 250
Study Book 60 100 160

Here is what I have tried:

awk ' FNR==NR{ a[$4]=$5;next} ( $3 in a){ print $3, a[$4],$4}' file1.txt file2.txt

Code output:

Book  100
Car  200
Apple  300

I am having trouble printing the file1 columns, and I do not know how to store the other columns of file1 in array a. Please guide me.

1 Answer

Could you please try the following.

awk 'FNR==NR{a[$4]=$3 OFS $4 OFS $5;b[$4]=$NF;next} ($3 in a){print a[$3],$NF,b[$3]+$NF}' file1.txt  file2.txt

Output will be as follows.

Study Book 60 100 160
Fruit Car 50 200 250
Fruit Apple 10 300 310
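
Note that these lines come out in file2.txt order, while the Expected Output in the question follows file1.txt order. If the order matters, one possible variant (a minimal sketch, assuming the values in the 3rd column of file2.txt are unique) is to read file2.txt first and print while reading file1.txt. The scratch directory `tmp` and the `result` variable are only scaffolding for this illustration and are not part of the original answer.

```shell
# Recreate the sample files from the question in a scratch directory (illustrative scaffolding).
tmp=$(mktemp -d)
cat > "$tmp/file1.txt" <<'EOF'
zzz pq Fruit Apple 10
zzz rs Fruit Car 50
zzz tu Study Book 60
EOF
cat > "$tmp/file2.txt" <<'EOF'
aa bb Book 100
cc dd  Car 200
hj kl XYZ 500
ee ff Apple 300
ff gh ABC 400
EOF

# Read file2.txt first, storing its 4th field keyed by its 3rd field,
# then print matches while reading file1.txt so rows follow file1.txt order.
result=$(awk 'FNR==NR{b[$3]=$4; next} ($4 in b){print $3, $4, $5, b[$4], $5+b[$4]}' "$tmp/file2.txt" "$tmp/file1.txt")
printf '%s\n' "$result"
rm -rf "$tmp"
```

With the sample data this should print the Apple, Car, and Book lines in the same order as the Expected Output. Only one array is needed here because the file1.txt fields are still available as $3, $4, and $5 when the match is printed.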

Explanation: here is the same code again, annotated line by line.

awk '                              ##Start the awk program.
FNR==NR{                           ##FNR==NR is TRUE only while the first Input_file, file1.txt, is being read.
  a[$4]=$3 OFS $4 OFS $5           ##Array a, indexed by $4, holds the 3rd, 4th and 5th fields joined by OFS (a single space by default).
  b[$4]=$NF                        ##Array b, indexed by $4, holds $NF (the last field of the current line, here the 5th).
  next                             ##next skips the remaining statements for this line and moves on to the next line.
}
($3 in a){                         ##For lines of file2.txt, check whether the 3rd field is an index of array a; if so, do the following.
  print a[$3],$NF,b[$3]+$NF        ##Print a[$3], the last field of the current line, and the sum of b[$3] and that last field.
}
' file1.txt  file2.txt             ##Input_file names: file1.txt is read first, then file2.txt.
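
The FNR==NR condition works because NR counts lines across all input files, while FNR restarts at 1 for each file, so the two are equal only while the first file is being read. A small sketch that prints both counters may make this easier to see. The scratch directory `tmp`, the `counters` variable, and the labels "first-file block" and "main block" are made up for this illustration. One caveat: if the first file is empty, FNR==NR is true for the lines of the second file instead.

```shell
# Recreate file1.txt and the first two lines of file2.txt in a scratch directory (illustrative scaffolding).
tmp=$(mktemp -d)
printf '%s\n' 'zzz pq Fruit Apple 10' 'zzz rs Fruit Car 50' 'zzz tu Study Book 60' > "$tmp/file1.txt"
printf '%s\n' 'aa bb Book 100' 'cc dd  Car 200' > "$tmp/file2.txt"

# For every input line, print the file name, both counters,
# and which block of the two-file program would handle that line.
counters=$(cd "$tmp" && awk '{print FILENAME, "NR=" NR, "FNR=" FNR, (FNR==NR ? "first-file block" : "main block")}' file1.txt file2.txt)
printf '%s\n' "$counters"
rm -rf "$tmp"
```

The three file1.txt lines should show equal NR and FNR values, and the file2.txt lines should show NR continuing at 4 and 5 while FNR restarts at 1 and 2.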