awk | split the column and map with substring

Question

File1 (Big file with millions of records)

1,5,404920012,type1,info1
4,2,10000234,type2,info2
2,3,40492001223,type3,info3
3,1,50492001223,type4,info4

File2 (small file with only 10 lines)

40492=product1
10000=product2

Output

1,5,404920012,type1,product1,info1
4,2,10000234,type2,product2,info2
2,3,40492001223,type3,product1,info3
3,1,50492001223,type4,NOMATCH,info4

Notes:
I want to compare first 5 character of 3rd column from file1 to file 2
after matching fields, product1 should be in 5th column of output

What I have Tried:
Though I am new in awk advance programming, but what I have a learnt so far.. I can merge 2 files as follows:

BEGIN { FS = OFS = "," } FNR == NR { x=$1; $1 = ""; a[x] = $0; next }
{ if($3 in a)print $0","a[$3]; else print $0",,"; }

But issue with this is.. I dont know how to pass different FS (= in case of 2nd file)for 2nd file and how to combine substr function echo 404920012 | awk '{print substr($0,1,5)}' in if condition..

I have updated what I have tried and where I am stuck below my query.. — Vipin Choudhary

Håkon Hægland Håkon Hægland · Accepted Answer · 2013-11-29T09:26:16

Try

awk -f p.awk file2 file1

where p.awk is

BEGIN{
    FS=","
    OFS=","
}

NR==FNR {
    split($1,a,"=")
    keys[a[1]]=a[2]
    next
}
{
    k=substr($3,1,5)
    if (k in keys) 
        p=keys[k]
    else
        p="NOMATCH"
    $4=$4 OFS p
    print
}

with output:

1,5,404920012,type1,product1,info1
4,2,10000234,type2,product2,info2
2,3,40492001223,type3,product1,info3
3,1,50492001223,type4,NOMATCH,info4

awk | split the column and map with substring

2 Answers