Replace values from other columns if value of one column is null using awk/sed

Question

I am trying to modify a tab delimited file by replacing the value of 2 columns separated by ":" where there is a null value in one column:

Start with this file:

id1 id2 pos1    pos2
749 2847443 21  13517135
862 2821796 21  13571669
997     21  13636494
1095    2821826 21  13661335
1131        21  13678797

I would like to modify column 2 to get this:

id1 id2 pos1    pos2
749 2847443 21  13517135
862 2821796 21  13571669
997 21:13636494 21  13636494
1095    2821826 21  13661335
1131    21:13678797 21  13678797

The problem is also that there is no blank space (" ") indicating a null value. For now I have not even been able to substitute the value of column 2 with column 3 when it is null...

I have tried using sed:

sed -e 's/\t\t/\t$3\t/g' input.txt > output

Which works but it just substitutes the text '$3' and I can't find how to substitute the values of $3:$4 instead.

I have also tried awk:

awk 'BEGIN {
  IFS = OFS = "\t"
 }
{
    for (column = 2; column <= NF; ++column) {
    if ($column == "") {
        $column = $3
    }
 }    
 print 
}         
'
input.txt > output

But this does not work either (it does nothing actually neither using "" nor " "...)

Can you please help me? Thank you.

Kent Kent · Accepted Answer · 2012-01-07T18:06:45

checking null value is actually easy. but I didn't understand the requirement well. which column could be null? in your awk script, you have loop from $2->$NF, if there is null column, you didn't set to ":" separated values, but set with $3. what about $3 is null?

I assume that only $2 (column2) could be null, then the following awk line should do the job.

 awk -F'\t' -vOFS='\t' '!$2{$2=$3":"$4}1' file

test

<ff is your input file>

kent$  awk -F'\t' -vOFS='\t' '!$2{$2=$3":"$4}1' ff          
id1     id2     pos1    pos2
749     2847443 21      13517135
862     2821796 21      13571669
997     21:13636494     21      13636494
1095    2821826 21      13661335
1131    21:13678797     21      13678797

to see it clearly, we could pipe the output to column command:

kent$  awk -F'\t' -vOFS='\t' '!$2{$2=$3":"$4}1' ff|column -t
id1   id2          pos1  pos2
749   2847443      21    13517135
862   2821796      21    13571669
997   21:13636494  21    13636494
1095  2821826      21    13661335
1131  21:13678797  21    13678797

hope it is helpful for you.

Replace values from other columns if value of one column is null using awk/sed

2 Answers