2
votes

I want to plot a TSV file in which some values are missing, and I'm looking for a way to make gnuplot recognize empty fields. As far as I can tell, gnuplot treats the line a\t\tc as [a,c], but I want it to treat the line as [a,empty,c].

I have a file, data.tsv, like this:

data1     data2     data3
1.2       12.4      129.3
2.4       32.4      134.8
3.2                 121.5
3.4       15.4      214.5

Assume the fields are separated by \t. That is, line 0 is "data1\tdata2\tdata3\n" and line 3 is "3.2\t\t121.5\n". Note that on line 3 the value for data2 is missing.

When I tell gnuplot

set datafile separator "\t"
plot "data.tsv" using 1:2

gnuplot does plot the data, but on line 3 it uses the data3 value in place of the missing data2 value, so it plots [1.2, 2.4, 3.2, 3.4] versus [12.4, 32.4, 121.5, 15.4].

I would like to plot [1.2, 2.4, 3.4] versus [12.4, 32.4, 15.4]. At the very least, I don't want 121.5 to end up in the plot. How can I do that?

2 Answers

3
votes

I guess this is a bug in your gnuplot version (which one do you have?). Your minimal script

set datafile separator "\t"
plot "data.tsv" using 1:2

works fine for me on Linux with version 4.6.4 and later. With 4.6.3 and earlier versions I also see your problem.

2
votes

If you're stuck on an older version of gnuplot and want to get round this problem, you could change the format of your data using awk:

$ awk -F'\t' -v OFS=',' '{$1=$1}1' data.tsv
1.2,12.4,129.3
2.4,32.4,134.8
3.2,,121.5
3.4,15.4,214.5

This short awk script converts the tabs in the input to commas in the output. The $1=$1 assignment forces awk to rebuild each record, which joins the fields using the output separator OFS. The 1 at the end always evaluates to true, so each line is printed.
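To see why the $1=$1 assignment matters, here is a small sketch that runs the same awk program with and without it on a single line with an empty middle field. The variable names (sample, without_rebuild, with_rebuild) are only for illustration.

```shell
# One tab-separated line with an empty middle field, like line 3 of data.tsv.
sample=$(printf '3.2\t\t121.5')

# Without the assignment, awk prints the record unchanged (tabs are kept).
without_rebuild=$(printf '%s\n' "$sample" | awk -F'\t' -v OFS=',' '1')

# With $1=$1, awk rebuilds the record and joins the fields with OFS.
with_rebuild=$(printf '%s\n' "$sample" | awk -F'\t' -v OFS=',' '{$1=$1}1')

printf '%s\n' "$with_rebuild"
```

The empty field survives the conversion as two adjacent commas, which is what lets gnuplot see three columns on that line.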

You can use this directly in gnuplot:

set datafile separator ","
plot "<awk -F'\t' -v OFS=',' '{$1=$1}1' data.tsv" using 1:2
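If all you need is to keep the incomplete row out of the plot, another possibility (my own suggestion, not tested against the old gnuplot versions) is to drop every row whose second field is empty before gnuplot reads the file. A minimal sketch, where filtered.tsv is just a hypothetical output name and the first command only recreates the sample data from the question:

```shell
# Recreate the sample file from the question (tab-separated, data2 missing on one row).
printf 'data1\tdata2\tdata3\n1.2\t12.4\t129.3\n2.4\t32.4\t134.8\n3.2\t\t121.5\n3.4\t15.4\t214.5\n' > data.tsv

# Keep only the rows whose second field is non-empty (the header row is kept too).
awk -F'\t' 'length($2) > 0' data.tsv > filtered.tsv

cat filtered.tsv
```

Since no empty fields are left in the filtered rows, the file can be plotted with the tab separator as in the question, and the same filter should also work inline in the plot command, in the same way as the awk pipeline above.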