2
votes

Note: I can control the format of the data file, but it has to be a single file.

I'm trying to plot multiple datasets on the same graph using gnuplot. Ideally, I'd like to plot data laid out like this:

data_1 0 0
data_2 0 0
data_1 1 1
data_2 0 1
data_1 2 2
data_2 1 2

And so on. In this case, data_1 and data_2 should be two separate curves.

I'd also like to avoid hard-coding the list, or even the number, of datasets in the gnuplot script. Basically, I'd like gnuplot to "group" the data points by a specific field and plot each group as a separate dataset on the same graph.

As a last-resort alternative, I could split the original file into one file per dataset using grep, and plot those (I guess it's easier?), but I'm looking for a way to do it with a single file.

2 Answers

4
votes

The gnuplot way to store your data is to separate the data sets with two empty lines. You can then use index to select the individual data sets within the single file:

data_1 0 0
data_1 1 1
data_1 2 2


data_2 0 0
data_2 0 1
data_2 1 2

And plot that with

plot 'file.dat' using 2:3 index 0, '' using 2:3 index 1

To get the number of data sets, use the stats command, which stores the number of data sets (data blocks) in the variable STATS_blocks. You can then iterate over the blocks:

stats 'file.dat' using 0 nooutput
plot for [i=0:(STATS_blocks - 1)] 'file.dat' using 2:3 index i

To extend this, you could even format your file as follows:

data_1
0 0
1 1
2 2


data_2
0 0
0 1
1 2

and use the first line of each data set as the plot key:

set key autotitle columnheader
stats 'file.dat' using 0 nooutput
plot for [i=0:(STATS_blocks - 1)] 'file.dat' using 1:2 index i
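Both layouts above can be produced from the interleaved file shown in the question. The following is only a minimal sketch, assuming whitespace-separated lines of the form "label x y"; the function name to_blocks and its header flag are made up for illustration and are not part of gnuplot.

```python
def to_blocks(lines, header=False):
    """Group 'label x y' lines by label and return gnuplot block-formatted text.

    header=False keeps the label as the first column of every row.
    header=True writes the label once, as the first line of its block.
    """
    groups = {}  # a dict keeps labels in the order they first appear
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        groups.setdefault(fields[0], []).append(fields[1:])

    blocks = []
    for label, rows in groups.items():
        if header:
            body = [label] + [" ".join(row) for row in rows]
        else:
            body = [" ".join([label] + row) for row in rows]
        blocks.append("\n".join(body))

    # Two empty lines between blocks: this is the separator that `index` counts.
    return "\n\n\n".join(blocks) + "\n"
```

Writing the returned text to file.dat gives a file that the index-based plot commands above can read; use header=True for the variant that relies on set key autotitle columnheader.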

1
votes

You can use an external program to fetch the values from the first column and then conditionally plot the data based on that.

For example, using python3 (and Windows-style quotes), we can do [1]:

values = system('python -c "data = sorted(set(x.split()[0] for x in open(\"datafile\",\"r\"))); print(\"\n\".join(data))"')

The variable values now holds the distinct first-column entries, here "data_1 data_2". We can then use plot for to loop over it, testing on each line whether the first column matches the current name. If it does not, the expression evaluates to 1/0, which makes gnuplot skip that point.

plot for [w in values] 'datafile' u 2:((strcol(1) eq w)?$3:1/0) with points pt 7 t w
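The one-liner passed to system is dense, so here is a more readable sketch of the same idea: collect the distinct first-column values and sort them. The function name dataset_labels is hypothetical. Unlike the one-liner, this version ignores empty lines instead of failing on them.

```python
def dataset_labels(path):
    """Return the sorted, distinct first-column values of a whitespace-separated file."""
    labels = set()
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if fields:  # ignore empty lines
                labels.add(fields[0])
    return sorted(labels)
```

Printing the result with print("\n".join(dataset_labels("datafile"))) gives output in the same shape as the one-liner, one name per line.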

Of course, the skipped points leave gaps in each curve, so with the lines style the segments would not be connected. If we don't want that, we can filter the data with an external program instead. For example, using awk (with Windows-style quotes):

plot for [w in values] sprintf('< awk "($1==\"%s\")" datafile',w) u 2:3 with lines t w

Here we use sprintf to build one awk command per dataset name. The leading < tells gnuplot to run the command and read its output as data. For the two names in this example, the commands are:

< awk "($1==\"data_1\")" datafile
< awk "($1==\"data_2\")" datafile
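For readers without awk, the filtering step amounts to keeping the lines whose first field equals the wanted name. A minimal Python sketch of that logic follows; the function name select_dataset is an assumption made for illustration.

```python
def select_dataset(lines, label):
    """Yield the lines whose first whitespace-separated field equals label."""
    for line in lines:
        fields = line.split()
        if fields and fields[0] == label:
            yield line.rstrip("\n")
```

Because the non-matching lines are dropped rather than turned into undefined points, gnuplot sees one uninterrupted series per name and can connect it with lines.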


[1] Alternatively, with awk, sort, and uniq (again with Windows-style quotes):

values = system('awk "{print $1}" datafile | sort | uniq')