Draw a curve of best fit with gnuplot

Question

Suppose I have a set of x,y points to plot for an image with gnuplot. It works as expected and I get a nice curve. I want to repeat the experiment for a large dataset of images (say 1000). At that point I would get 1000 curves on one plot, one curve per image. How do I tell gnuplot to draw a best fit of the curves?

I would like gnuplot to give me the x,y points of the best-fit curve in a CSV file, as I plan to produce a single plot of best fits later.

The data can be found here

What's the mathematical model for the fitting? If you have gnuplot draw a curve through a set of points, then my recollection is that it isn't finding any mathematical relationship between the points, but just using a cubic spline or other simple function to fit the curves on a point-by-point basis. You could somehow "average" these curves, but it wouldn't be a "best fit" in any mathematically rigorous sense. To have a best fit you need a mathematical model that you take your data to be a measurement of. Is your data related in some mathematical way? — Kevin Boone
@KevinBoone with fit you can fit any function to your data. — Christoph
@KevinBoone How do I compute a good average of the curves, then? I don't have a mathematical function. — user2650277
Hard to say without looking at the data. A general approach is to divide the x range into small blocks (the number needed will depend on the number of data points), then take an average of the y values of all the data points that fall into each x region. Then plot a single line using the centres of the x regions and the corresponding y averages as the data points. To be honest, I can't remember whether gnuplot can do this calculation automatically. This approach gives the same kind of line you would draw if you just "eyeballed" the data; however, it has no mathematical validity at all. — Kevin Boone
@KevinBoone the data and plots can be found @ encode.ru/threads/… — user2650277

user8153 · Accepted Answer · 2017-10-10T04:54:36

If I understand you correctly, you want to draw an average line through the data rather than fit a function to the data. You can do this using the smooth option of the plot command.

Depending on your needs you could draw an interpolation function through your data. For example:

plot \
"libjpeg-2000-bench.png.csv" u 3:5 w p, \
"libjpeg-2000-mural.png.csv" u 3:5 w p, \
"libjpeg-2000-red-room.png.csv" u 3:5 w p, \
"libjpeg-bench.png.csv" u 3:5 w p, \
"libjpeg-mural.png.csv" u 3:5 w p, \
"libjpeg-red-room.png.csv" u 3:5 w p, \
 "< tail -q -n +4  libjpeg*csv" u 3:5 smooth acsplines   w l lw 2

gives

You might want to experiment with the various smoothing functions; see help smooth. Some of those functions also take additional parameters. For example, you can specify a weight for the acsplines interpolation as a third column in the using specification:

plot \
"libjpeg-2000-bench.png.csv" u 3:5 w p, \
"libjpeg-2000-mural.png.csv" u 3:5 w p, \
"libjpeg-2000-red-room.png.csv" u 3:5 w p, \
"libjpeg-bench.png.csv" u 3:5 w p, \
"libjpeg-mural.png.csv" u 3:5 w p, \
"libjpeg-red-room.png.csv" u 3:5 w p, \
"< tail -q -n +4  libjpeg*csv" u 3:5:(100) smooth acsplines title "acsplines, weight = 100" w l lw 2,  \
"< tail -q -n +4  libjpeg*csv" u 3:5:(0.1) smooth acsplines title "acsplines, weight = 0.1" w l lw 2

The choice of the weight involves a trade-off: if the weight is large, the curve follows the data points more closely but will likely exhibit oscillations; if the weight is small, the curve is smoother but tracks the data less closely.

Alternatively you can bin the data points in the x direction, and average those data points that fall within the same bin. Luckily you can do all this from within gnuplot:

round(x) = floor(x+0.5)
bin(x,binwidth) = binwidth*round(x/binwidth)
binwidth = 1.
plot \
"libjpeg-2000-bench.png.csv" u 3:5 w p, \
"libjpeg-2000-mural.png.csv" u 3:5 w p, \
"libjpeg-2000-red-room.png.csv" u 3:5 w p, \
"libjpeg-bench.png.csv" u 3:5 w p, \
"libjpeg-mural.png.csv" u 3:5 w p, \
"libjpeg-red-room.png.csv" u 3:5 w p, \
 "< tail -q -n +4  libjpeg*csv"  u (bin($3,binwidth)):5 smooth uniq  w l lw 2

gives

Here you can adjust the bin size binwidth to your needs.
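
The question also asks for the x,y points of the averaged curve in a file. A minimal sketch of one way to do that, assuming the same input files and column layout as above: set table redirects the plot output to a text file instead of drawing it. The file name bestfit.dat is a hypothetical choice, not something from the original data set.

```gnuplot
# Same helper functions as in the binning example
round(x) = floor(x+0.5)
bin(x,binwidth) = binwidth*round(x/binwidth)
binwidth = 1.

# Write the plotted points to a text file instead of drawing them.
# "bestfit.dat" is a hypothetical output name.
set table "bestfit.dat"
plot "< tail -q -n +4  libjpeg*csv" u (bin($3,binwidth)):5 smooth uniq w l
unset table

# Later, plot the saved averages (column 1 = bin centre, column 2 = mean y)
plot "bestfit.dat" u 1:2 w l lw 2
```

The output is whitespace-separated, with x in the first column, the averaged y in the second, and a one-character flag in the third. As far as I know, recent gnuplot 5.2 releases also accept a separator comma option on set table to produce comma-separated output directly; check help set table for your version. The same approach works with smooth acsplines, in which case the number of points written is controlled by set samples.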
