0
votes

I have two files with multiple rows. The first file has 2 columns and the second file has more than 2000 columns. Example data sets are:

Example File_1

car Maruti
car TATA
car Hyundai
car Jaguar
Jeep Mahindra

Example File_2

car  A B C D E F G
Jeep X Y Z W Q W K

I have tried this:

awk '{a[$1]=a[$1]" "$0} ++n[$1]==2{print a[$1]}' File_2 File_1

The output of this command prints the data for each key only once, and not in the expected order. I got the following result:

car  A B C D E F G car Maruti
Jeep X Y Z W Q W K Jeep Mahindra

The expected output, based on the rows of the first file, is:

car  Maruti    car  A B C D E F G
car  TATA      car  A B C D E F G
car  Hyundai   car  A B C D E F G
car  Jaguar    car  A B C D E F G
Jeep Mahindra  Jeep X Y Z W Q W K
2
Does the output have to be exactly in that order or would a sorted output be acceptable too? - Socowi
Both the outputs are acceptable. - Ravi Saroch

2 Answers

2
votes

Could you please try the following.

awk 'FNR==NR{a[$1]=$0;next} ($1 in a){print $0,a[$1]}'  file2 file1 | column -t

Output will be as follows.

car   Maruti    car   A  B  C  D  E  F  G
car   TATA      car   A  B  C  D  E  F  G
car   Hyundai   car   A  B  C  D  E  F  G
car   Jaguar    car   A  B  C  D  E  F  G
Jeep  Mahindra  Jeep  X  Y  Z  W  Q  W  K
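
For readers unfamiliar with the FNR==NR idiom, here is the same one-liner spelled out with comments, as a minimal sketch. It recreates the question's sample data first; the scratch directory, the file names inside it, and the `result` variable are assumptions made only for illustration. The `column -t` step is left out so the raw awk output is visible.

```shell
# Recreate the example files from the question in a scratch directory
# (hypothetical paths, for illustration only).
tmpdir=$(mktemp -d)
printf '%s\n' 'car Maruti' 'car TATA' 'car Hyundai' 'car Jaguar' 'Jeep Mahindra' > "$tmpdir/file1"
printf '%s\n' 'car  A B C D E F G' 'Jeep X Y Z W Q W K' > "$tmpdir/file2"

result=$(awk '
    # FNR == NR is true only while the first file on the command line (file2) is read.
    FNR == NR { a[$1] = $0; next }   # store the whole wide row under its first field
    # For every row of file1, print it followed by the stored row with the same key.
    $1 in a   { print $0, a[$1] }
' "$tmpdir/file2" "$tmpdir/file1")

printf '%s\n' "$result"
rm -r "$tmpdir"
```

Because the wide file is read first and only looked up afterwards, every row of the first file gets its own output line, in the first file's order. This assumes each key appears only once in the wide file; with duplicate keys the last row would win.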
1
votes

It would be easier to use the join command here, which is specifically made for tasks like yours. However, join requires its input files to be sorted on the join field, so we sort first.

With bash, the sorting can be done on the fly using process substitution:

join <(sort 1stFile) <(sort 2ndFile)

With a plain POSIX shell (sh), you have to use temporary files:

 sort 1stFile > 1stFileSorted
 sort 2ndFile > 2ndFileSorted
 join 1stFileSorted 2ndFileSorted
 rm 1stFileSorted 2ndFileSorted

To align the columns in the output you can use join … | column -t.
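
As a rough end-to-end sketch of the temporary-file variant on the question's sample data: the scratch directory, the file names inside it, and the `result` variable are assumptions for illustration. `LC_ALL=C` is set on both sort and join here so that they agree on the collation order; that is a choice of this sketch, not part of the commands above.

```shell
# Recreate the example files from the question (hypothetical paths).
tmpdir=$(mktemp -d)
printf '%s\n' 'car Maruti' 'car TATA' 'car Hyundai' 'car Jaguar' 'Jeep Mahindra' > "$tmpdir/file1"
printf '%s\n' 'car  A B C D E F G' 'Jeep X Y Z W Q W K' > "$tmpdir/file2"

# Sort both inputs with the same collation that join will use.
LC_ALL=C sort "$tmpdir/file1" > "$tmpdir/file1.sorted"
LC_ALL=C sort "$tmpdir/file2" > "$tmpdir/file2.sorted"

# Join on the first field (the default); the key is printed once per output line.
result=$(LC_ALL=C join "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")

printf '%s\n' "$result"
rm -r "$tmpdir"
```

Note that join prints the key only once per line and emits the rows in sorted order, so the result is not identical to the layout in the question, where the key is repeated before the second file's columns.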