2
votes

Let's say I have a lot of data of the form

<x-value> <y-value> <standard deviation>

By "a lot" I mean too much to meaningfully plot every single point with yerrorbars.

I would now like to create a boxplot for, say, every 50 lines (one at x=25 for lines 1..50, one at x=75 for lines 51..100, and so on).

What is the quickest way to do that with gnuplot?

EDIT

Apparently, my question was not precise enough:

I want to take all values for x=1..50, pretend they all happened at 25, create a boxplot. Take all values for x=51..100, pretend they all happened at 75 and create a boxplot, and so on.

1
@Christoph not applicable because I'm not interested in a sum of any kind. - User1291
I said I wanted boxplots, not the averages. I.e. take all values for x=1..50, pretend they all happened at 25, create a boxplot. Take all values for x=51..100, pretend they all happened at 75 and create a boxplot, and so on. - User1291
Sorry, I misread the question. I hopefully got you right now, see my answer. - Christoph

1 Answer

1
votes

Gnuplot can calculate boxplots and group them by a common string (the "level"). You could create these strings with a function like

level(x) = sprintf("%d", (int(x)/50)*50 + 25)

With this function, every 50 values are combined into one boxplot:

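set style data boxplot
# Map x to the label of its bin: 0..49 -> "25", 50..99 -> "75", ...
level(x) = sprintf("%d", (int(x)/50)*50 + 25)
# using: x position of the first boxplot : data column : width (0 = default) : grouping level
plot 'file.dat' using (1):2:(0):(level($1)) notitle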

Note that here I group by the x-value. If you want to group by the row number instead, replace the $1 with $0. Also note that each boxplot gets its own xticlabel, which cannot be avoided with this approach.
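To see where the bin boundaries fall, the grouping function can be evaluated directly at a few values. This is only a quick check, not part of the plot itself. The name level1 is a hypothetical variant introduced here for illustration, assuming your x-values start at 1 rather than 0 and you want 1..50 to land in the first bin:

```gnuplot
# Grouping function from above: integer division by 50 selects the bin.
level(x) = sprintf("%d", (int(x)/50)*50 + 25)

# Hypothetical 1-based variant (an assumption, not from the original answer):
# shifts x by one so that 1..50 -> "25", 51..100 -> "75", ...
level1(x) = sprintf("%d", ((int(x) - 1)/50)*50 + 25)

# level(0) and level(49) give "25", level(50) gives "75"
print level(0), level(49), level(50)

# level1(1) and level1(50) give "25", level1(51) gives "75"
print level1(1), level1(50), level1(51)
```

With level, an x-value of exactly 50 already belongs to the second boxplot. Since $0 counts rows from 0, grouping by $0 with level puts the first 50 lines of the file into the first boxplot.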

As a test case, consider the following script (because of the dynamic data generation with seq and awk, this test case only works on Unix platforms):

set style data boxplot
set style fill solid 0.25 noborder
set boxwidth 0.8
level(x) = sprintf("%d", (int(x)/50)*50 + 25)

# Generate the values 0..299 on the fly, using each number as both x and y
plot "< seq 0 299 | awk '{print $1,$1}'" using (1):2:(0):(level($1)) notitle
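If seq and awk are not available, the same test data could probably be written from within gnuplot first. The following is a sketch under that assumption; the file name test.dat is made up for this example, and the do for loop needs gnuplot 4.6 or newer:

```gnuplot
# Write the values 0..299 (as both x and y) to a temporary data file.
# 'test.dat' is a hypothetical file name chosen for this example.
set print 'test.dat'
do for [i=0:299] {
    print sprintf("%d %d", i, i)
}
set print    # close the file and restore the default print target

set style data boxplot
set style fill solid 0.25 noborder
set boxwidth 0.8
level(x) = sprintf("%d", (int(x)/50)*50 + 25)

plot 'test.dat' using (1):2:(0):(level($1)) notitle
```

This trades the one-line pipe for an extra file on disk, but it avoids any dependency on external Unix tools.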
