0
votes

cat file1:

a
b
c
d
e

cat file2:

a  10
c  20
e  30
f  40

The desired output file is:

a  10
b
c  20
d
e  30
f  40

I've tried using awk, but I ended up with all the lines of file1 repeated. Many thanks.

3
What's the logic that governs that output? - glenn jackman
"I've tried using awk" .. you need to include code in your Q so we can help fix it! Else try searching on [linux] join or [bash] join and read man join. Good luck. - shellter

3 Answers

3
votes

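Read both files and hash every record into an array (a), keyed on the first field. If you read file1 before file2, key collisions work in your favor: the file2 record, which carries the value, overwrites the bare key stored from file1. In awk: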

$ awk '{a[$1]=$0} END{for(i in a) print a[i]}' file1 file2
a  10
b
c  20
d
e  30
f  40

Explained:

{
    a[$1]=$0        # hash all records to a 
} 
END {               # after processing both files
    for(i in a)     # iterate thru every key in a
        print a[i]  # and output their values
}

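Because for(i in a) visits the keys in an unspecified order, the output order is not guaranteed; the tidy ordering shown above is not something to rely on. Pipe the result through sort if you need a fixed order.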
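If a deterministic order matters, one option is to sort the merged records afterwards. The following is a minimal sketch, not part of the original answer: it recreates the sample files from the question in a scratch directory (the tmpdir and merged variable names are made up for illustration) and assumes a plain byte-wise sort on the whole line is acceptable.

```shell
# Recreate the sample inputs from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' a b c d e > "$tmpdir/file1"
printf '%s\n' 'a  10' 'c  20' 'e  30' 'f  40' > "$tmpdir/file2"

# Same hashing approach as above: file2 records overwrite file1 keys.
# Piping through sort removes the dependence on awk's traversal order.
merged=$(awk '{a[$1]=$0} END{for(i in a) print a[i]}' \
             "$tmpdir/file1" "$tmpdir/file2" | LC_ALL=C sort)

printf '%s\n' "$merged"
rm -rf "$tmpdir"
```

Sorting changes the original line order of the inputs, so this only fits when the keys are meant to come out sorted anyway, as they do in the sample data.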

2
votes

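Another awk approach, less clever, but it keeps the line order of file1. It reads file2 into an array first, then prints each line of file1 with the stored value appended when the key exists. Keys that occur only in file2 are printed at the end.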

$ awk 'NR==FNR {a[$1]=$2; next} 
       $1 in a {$2=a[$1]; delete a[$1]} 
               1; 
       END     {for(k in a) print k,a[k]}' file2 file1

a 10
b
c 20
d
e 30
f 40

0
votes

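Not sure if awk is a hard requirement for the OP, but here's an alternative solution using join, which seems more appropriate for the job at hand (as @shellter also pointed out). Note that join expects both inputs to be sorted on the join field, which the sample files already are: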

$ join -a1 -a2 file1 file2
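
When the inputs are not guaranteed to be sorted, they can be sorted on the join field first. The sketch below is an illustration under assumptions rather than part of the original answer: it rebuilds the sample files in a scratch directory, and the tmpdir, .sorted, and joined names are hypothetical. The -a1 and -a2 options tell join to also print unpairable lines from the first and second file.

```shell
# Recreate the sample inputs from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' a b c d e > "$tmpdir/file1"
printf '%s\n' 'a  10' 'c  20' 'e  30' 'f  40' > "$tmpdir/file2"

# Sort both files on the first field, ignoring leading blanks,
# using the same locale for sort and join so their orderings agree.
LC_ALL=C sort -k 1b,1 "$tmpdir/file1" > "$tmpdir/file1.sorted"
LC_ALL=C sort -k 1b,1 "$tmpdir/file2" > "$tmpdir/file2.sorted"

# Full outer join on field 1: keep unpaired lines from both files.
joined=$(LC_ALL=C join -a1 -a2 "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")

printf '%s\n' "$joined"
rm -rf "$tmpdir"
```

join separates output fields with a single space by default, so the two-space gap from file2 comes out as one space.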