2
votes

File1 (Big file with millions of records)

1,5,404920012,type1,info1
4,2,10000234,type2,info2
2,3,40492001223,type3,info3
3,1,50492001223,type4,info4

File2 (small file with only 10 lines)

40492=product1
10000=product2

Output

1,5,404920012,type1,product1,info1
4,2,10000234,type2,product2,info2
2,3,40492001223,type3,product1,info3
3,1,50492001223,type4,NOMATCH,info4

Notes:
I want to compare first 5 character of 3rd column from file1 to file 2
after matching fields, product1 should be in 5th column of output

What I have Tried:
Though I am new in awk advance programming, but what I have a learnt so far.. I can merge 2 files as follows:

BEGIN { FS = OFS = "," } FNR == NR { x=$1; $1 = ""; a[x] = $0; next }
{ if($3 in a)print $0","a[$3]; else print $0",,"; }

But issue with this is.. I dont know how to pass different FS (= in case of 2nd file)for 2nd file and how to combine substr function echo 404920012 | awk '{print substr($0,1,5)}' in if condition..

2
I have updated what I have tried and where I am stuck below my query..Vipin Choudhary

2 Answers

1
votes

Try

awk -f p.awk file2 file1

where p.awk is

BEGIN{
    FS=","
    OFS=","
}

NR==FNR {
    split($1,a,"=")
    keys[a[1]]=a[2]
    next
}
{
    k=substr($3,1,5)
    if (k in keys) 
        p=keys[k]
    else
        p="NOMATCH"
    $4=$4 OFS p
    print
}

with output:

1,5,404920012,type1,product1,info1
4,2,10000234,type2,product2,info2
2,3,40492001223,type3,product1,info3
3,1,50492001223,type4,NOMATCH,info4
2
votes

This is a case when, since it varies between input files, you want to set FS to the appropriate value between files rather than up front in the awk command arguments using -F or -v FS=:

awk '
NR==FNR { map[$1] = $2; next }
{
    key = substr($3,1,5)
    $5 = (key in map ? map[key] : "NOMATCH") OFS $5
    print
}
' FS='=' File2 FS=',' OFS=',' File1

Just in case: This is just briefer than the answer you selected, functionally it's equivalent so please don't change to select this one.