Subtract multiple comma-separated values in two tab-separated columns using awk

Question

I have the following table

textA   textB   1,21,41 10,30,50
textC   textB   2,22,42,62  10,30,50,70

The values in the third column should be subtracted element-wise from those in the fourth column, i.e. 10-1, 30-21, 50-41 for the first row and 10-2, 30-22, 50-42, 70-62 for the second. The result should be printed as a fifth tab-separated column, so the output table should look like this:

textA   textB   1,21,41 10,30,50    9,9,9
textC   textB   2,22,42,62  10,30,50,70 8,8,8,8

I have tried to combine some awk code lines:

(Pseudo-)code line one could theoretically subtract multiple values in one column, independently of how many values are in the column

awk '{for(i=1;i<=NF;i++)x-=$i;print x}' fileA

I now generate two independent files based on my third and fourth column

awk -F'\t' '{print $3}' fileA > fileB
awk -F'\t' '{print $4}' fileA > fileC

(Pseudo-)Code line three could theoretically process values from different files

awk 'NR==FNR{a[NR]=$1;next}{print $1+a[FNR],$2}' file1 file2

I try to combine code line one with code line three:

awk 'NR==FNR{a[NR]=$1;next}{print $1+a[FNR],{for(k=1;k<=NF;k++)z-=$i;print z}$2}' fileB fileC

That's where I got stuck. I'd be happy for any ideas.

Tom Fenech · Accepted Answer · 2016-10-15T14:21:32

No need for any temporary files. This can be done using one invocation of awk:

BEGIN {
    FS = OFS = "\t"
}

{
    n = split($3 "," $4, a, /,/) / 2
    printf "%s%s", $0, OFS
    for (i = 1; i <= n; ++i)
        printf "%d%s", a[i+n]-a[i], (i<n?",":ORS)
}

Split the third and fourth columns on a comma. Print the line, followed by a tab, followed by the results of each subtraction.

This assumes that the third and fourth columns contain the same number of values on every line.
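
If the counts could differ, one possible guard is to split the two columns separately and compare their lengths before subtracting. This is only a sketch and not part of the original answer: the "NA" placeholder, the wording of the warning, and the sample row textD are assumptions made for illustration.

```shell
# Two sample rows: the first is well-formed, the second (hypothetical) has
# 2 values in column 3 but 3 values in column 4.
result=$(printf 'textA\ttextB\t1,21,41\t10,30,50\ntextD\ttextB\t1,2\t10,20,30\n' |
awk '
BEGIN { FS = OFS = "\t" }
{
    n3 = split($3, x, /,/)   # values to subtract
    n4 = split($4, y, /,/)   # values to subtract from
    if (n3 != n4) {
        # Assumed policy: warn on stderr and append "NA" instead of a result.
        printf "line %d: %d vs %d values\n", NR, n3, n4 > "/dev/stderr"
        print $0, "NA"
        next
    }
    out = ""
    sep = ""
    for (i = 1; i <= n3; ++i) {
        out = sprintf("%s%s%d", out, sep, y[i] - x[i])
        sep = ","
    }
    print $0, out
}')

printf '%s\n' "$result"
```

Whether a mismatched row should be skipped, flagged, or should abort the run depends on the data, so the "NA" policy here is just one option.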

Save the script as script.awk and run it with awk -f script.awk file.
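
To try this end to end, the following sketch recreates the two sample rows from the question with real tab characters and passes the same awk program inline rather than through -f. The temporary directory and the file name "file" are assumptions for the demonstration only.

```shell
# Recreate the sample table from the question (tab-separated).
tmpdir=$(mktemp -d)
printf 'textA\ttextB\t1,21,41\t10,30,50\n'        >  "$tmpdir/file"
printf 'textC\ttextB\t2,22,42,62\t10,30,50,70\n' >> "$tmpdir/file"

# Same program as in the answer, given inline instead of via -f script.awk.
result=$(awk '
BEGIN { FS = OFS = "\t" }
{
    # Join both columns, split once; a[1..n] is column 3, a[n+1..2n] is column 4.
    n = split($3 "," $4, a, /,/) / 2
    printf "%s%s", $0, OFS
    for (i = 1; i <= n; ++i)
        printf "%d%s", a[i+n] - a[i], (i < n ? "," : ORS)
}' "$tmpdir/file")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

For the sample input this should append 9,9,9 to the first row and 8,8,8,8 to the second, matching the desired output in the question.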
