2
votes

I am new to awk and the shell generally. i want to manipulate some files and find the common rows to two files based on a column

and write the combination of the row from file1 and row from file2 as a row in file3.

I have checked many proposed solutions online, which brought me to getting the following results. The file structure and the commands I used are as follows.

file1.tab

 name    level   regno   dept    sex
 john    900     123     csc     male
 debby   800     378     mth     male
 ken     800     234     csc     male
 sol     700     923     mth     female
 dare    900     273     phy     male
 olanna  800     283     csc     female
 olumba  400     245     phy     male
 petrus  800     284     mth     female

file2.tab

 regno   grade
 234     A
 283     D
 123     A
 273     B

I was able to get file3.tab with this command

awk 'NR==FNR{a[$1];next} $3 in a {print $0}' file2.tab file1.tab > file3.tab

file3.tab

 name    level   regno   dept    sex
 john    900     123     csc     male
 ken    800     234     csc     male
 dare     900     273     phy     male
 olanna     800     283     csc     female

But what I want is the whole of file1 row with file2 row attached to it like this

 name    level   regno   dept    sex    regno    grade
 john    900     123     csc     male    123    A
 ken    800     234     csc     male    234    A
 dare     900     273     phy     male    273    B
 olanna     800     283     csc     female    283    D

Secondly, I also want to get file3.tab in this format

 name    level   regno   dept    sex    grade
 john    900     123     csc     male    A
 debby   800     378     mth     male    NA
 ken     800     234     csc     male    A
 sol     700     923     mth     female    NA
 dare    900     273     phy     male    B
 olanna  800     283     csc     female    D
 olumba  400     245     phy     male    NA
 petrus  800     284     mth     female    NA

I used this command

awk 'FNR==NR{a[$1]=$1;next}{print $0, "\t" (($3 in a)? a[$1]:"NA")}' file2.tab file1.tab > file3-2.tab

But what I got is this and the grades from the file2.tab are not showing

 name    level   regno   dept    sex   
 john    900     123     csc     male   
 debby   800     378     mth     male    NA
 ken     800     234     csc     male   
 sol     700     923     mth     female    NA
 dare    900     273     phy     male   
 olanna  800     283     csc     female   
 olumba  400     245     phy     male    NA
 petrus  800     284     mth     female    NA

All files are tab-delimited. Kindly help me resolve these.

1

1 Answers

2
votes

You can use this awk command to achieve your output:

awk 'BEGIN{FS=OFS="\t"} NR==FNR{a[$1]=$2;next} {
     print $0, ($3 in a ? a[$3] : "NA")}' file2.tab file1.tab

name    level  regno  dept  sex     grade
john    900    123    csc   male    A
debby   800    378    mth   male    NA
ken     800    234    csc   male    A
sol     700    923    mth   female  NA
dare    900    273    phy   male    B
olanna  800    283    csc   female  D
olumba  400    245    phy   male    NA
petrus  800    284    mth   female  NA