4
votes

I would like to create a graph like this one using gnuplot (or matplotlib, if need be), but I don't know if/how it can be done:

[Image: rough outline of what the graph should look like]

This, of course, is just a rough sketch. What is important is that I need to plot pairs of values (in this example, each pair consists of a red and a blue dot). One item in each pair is a single value; the other is supposed to show a range of values (my idea was to plot the mean value with error bars to indicate the range's max and min, but I'm open to better ideas). The x-axis has no purpose other than giving the names of the various categories; all that matters are the y-values.

I am pretty sure I could create something like this (value pairs and x-categories) using histograms, but boxes just strike me as wrong in this case.

What I have so far is this gnuplot command:

plot 'TEST.out' using 0:2:3:xticlabel(1) w errorbars pt 7 notitle

Used with this data file (category name, y-value, error bar value):

cat1 15 0
cat1 18 3
cat2 13 0
cat2 10 4

it yields the plot below, which goes in the right direction but is not yet ideal: all data points have the same colour, and for the single values you can still see that error bars were used. The grouping isn't very nice either; if the two points making up one pair were closer to each other, the plot would be easier on the eye.

[Image: where I have got so far]

If anyone has any suggestions (even for creating a graph that does not look exactly like the example I gave in the beginning), I would be very grateful.

Thanks a lot for your time!

2 Answers

5
votes

In Matplotlib, you can set ticks individually. The method is explained in one of Matplotlib's examples.

Here is a start:

# Initializations:
from matplotlib import pyplot as pp
import numpy as np

pp.ion()  # "Interactive mode on": pyplot.* commands draw immediately

# Data:
series_red_y = [1.3, 1.4, 2.2]
series_blue_y = [1.6, 1.8, 1.8]
series_blue_err = [0.25, 0.25, 0.5]
names = ('Category 1', 'Category 2', 'Category 3')

# Where on the x-axis the data will go:    
series_red_x = np.arange(0, 3*len(series_red_y), 3)  # Step of 3: red dot, blue dot, empty space
series_blue_x = np.arange(1, 3*len(series_blue_y)+1, 3)  # Step of 3: red dot, blue dot, empty space

# Plotting:
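pp.scatter(series_red_x, series_red_y, c='r', s=100)
pp.scatter(series_blue_x, series_blue_y, c='b', s=100)
# fmt='none' draws only the error bars (newer Matplotlib versions reject fmt=None):
pp.errorbar(series_blue_x, series_blue_y, yerr=series_blue_err, fmt='none',
            ecolor='b', capsize=0)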
pp.xticks((series_red_x+series_blue_x)/2., names)

# We wait until the user is ready to close the program:
input('Press enter...')  # Python 3; use raw_input() on Python 2

Here is the result, which you can customize to your specific needs:

[Image: the resulting plot]
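
To apply the same layout to the data file from the question, the series and x positions can be derived from the file instead of being typed in. The following is only a sketch: it assumes the file alternates a single-value row and a range row for each category, as TEST.out does, and the function names read_pairs, pair_positions and plot_pairs are made up for illustration.

```python
def read_pairs(lines):
    """Split alternating rows (single value, then mean + error) into series.

    Assumes the three-column layout from the question: name, y, error.
    """
    names, single_y, range_y, range_err = [], [], [], []
    rows = [line.split() for line in lines if line.strip()]
    # Odd rows hold the single values, even rows the ranges.
    for single, ranged in zip(rows[0::2], rows[1::2]):
        names.append(single[0])
        single_y.append(float(single[1]))
        range_y.append(float(ranged[1]))
        range_err.append(float(ranged[2]))
    return names, single_y, range_y, range_err


def pair_positions(count, step=3):
    """Return x positions for single dots, range dots, and tick labels."""
    single_x = [step * i for i in range(count)]
    range_x = [x + 1 for x in single_x]    # right next to the single dot
    tick_x = [x + 0.5 for x in single_x]   # centred under each pair
    return single_x, range_x, tick_x


def plot_pairs(path):
    """Draw the paired plot; matplotlib is imported only when plotting."""
    from matplotlib import pyplot as pp

    with open(path) as handle:
        names, single_y, range_y, range_err = read_pairs(handle)
    single_x, range_x, tick_x = pair_positions(len(names))
    pp.scatter(single_x, single_y, c='r', s=100)
    pp.errorbar(range_x, range_y, yerr=range_err, fmt='o', color='b',
                capsize=0)
    pp.xticks(tick_x, names)
    pp.show()
```

Calling plot_pairs('TEST.out') should then draw one red dot and one blue dot with an error bar per category. A larger step widens the gap between pairs without moving the two dots of a pair apart.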

1
votes

Here is a gnuplot solution:

set xrange [-.25:2.75]
set xtics ('cat0' .25, 'cat1' 1.25, 'cat2' 2.25)
plot '< sed 1~2!d TEST.out' using 0:2:(.1) with circles, \
     '< sed 1~2d TEST.out' using (.5+$0):2:3 with errorbars

You might need to generate the first two lines of the script if the number of categories is not known beforehand. It should be rather easy, e.g. in bash:

cat_count=$(wc -l < TEST.out)
let cat_count/=2
let cat_count--
echo set xrange "[-.25:$cat_count.75]"
echo -n 'set xtics('
for i in $(seq 0 $cat_count) ; do
    echo -n "'cat$i'" $i.25
    ((i==cat_count)) || echo -n ,
done
echo ')'
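
The bash loop labels the ticks cat0, cat1, ... rather than using the names stored in the data file. As a hedged alternative, here is a Python sketch that builds the same two gnuplot lines from the file's first column. It assumes two rows per category (single value first, then the range), and gnuplot_preamble is a hypothetical helper name, not part of gnuplot.

```python
def gnuplot_preamble(lines):
    """Build 'set xrange' and 'set xtics' from alternating data rows.

    Assumes two rows per category (single value, then range), as in TEST.out.
    """
    rows = [line.split() for line in lines if line.strip()]
    names = [row[0] for row in rows[0::2]]  # one name per pair
    last = len(names) - 1                   # index of the last category
    xrange_cmd = "set xrange [-0.25:%.2f]" % (last + 0.75)
    # Each label sits midway between the single dot (i) and the range (i + .5).
    tics = ", ".join("'%s' %.2f" % (name, i + 0.25)
                     for i, name in enumerate(names))
    return [xrange_cmd, "set xtics (%s)" % tics]
```

Printing the returned lines, for example with print("\n".join(gnuplot_preamble(open("TEST.out")))), gives a preamble that can be placed in front of the plot command.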