2
votes

I have a gnuplot script that produces bar graphs like this:

bar example

The input data is in files that have a number of columns, each column ultimately contributes to a cluster in the chart (2 clusters shown in the example). Each file contributes to a bar in the chart (there are 9 in the example). Each file may have a large number of rows.

The script takes the input data files and, using the stats command, produces new files containing one row per column of the original files. Each row contains a mean, min and max value for its source column.

These new files are then used to plot the bar chart with error bars. Each file represents one bar and each row contributes to one cluster. The plot code is as follows:

plot for [f in FILES] f.'.stats' using 2:3:4 title columnhead(1), \
     '' using (0):xticlabels(1) with lines

Now I have a second set of files and that produce another similar bar chart. I would like to combine these charts onto one so there will be two rows of 3-D bars, one in front of the other (rendered with a 3-D style - the new 'z' axis representing the two data sets (two sets of FILES).

Here is an example to illustrate the look I'm after (obviously not made with gnuplot!):

enter image description here

Can I do this with Gunplot?

I have read the user manual and the Gnuplot In Action book but haven't found anything that would indicate this is possible.

3

3 Answers

4
votes

gnuplot version 5.3 (the development branch) adds a 3D barchart variant 3D boxes demo. However rendering the boxes in 3D unfortunately depends on features that were not present in earlier gnuplot release versions so I cannot offer a work-around for the current one (5.2.4). Also the new 3D variant does not show error bars, although I think one could construct a plot command that would add them. enter image description here

3
votes

I produced a 3D bar chart using the development 5.3 version (git checkout). Here is my splot command:

splot for [c = 1:ncats] for [f = 1:nfiles] \
          word(cat_files[c],f).'.stats' \
          using (f+column(0)*(nfiles+2)):(scale_y(c)):2 \
          with boxes \
          title (c==1 ? columnhead(1) : '') 

The input data is in a set of 'stats' files as described in the question. To draw the plot, I separated the input FILES into categories - two (ncats) sets of files held in the array cat_files, each containing the same number of files (nfiles).

The categores equate to positions on the y-axis (rows) and the individual files equate to positions on the x-axis (bars). Rows in each file equate to clusters of bars and the values in each row is the bar height which is the Z axis. The Z axis was the Y axis in the 2D model. The nasty expressions are to position the bars on the x and y axes as I explain below.

I had a lot of difficulty getting this to work but I think that the result looks good:

enter image description here

The problems, which I cover below are:

  • matching colours between chart 'rows' of the y-axis
  • bar dimensions - making square bars is very hit-and-miss, hence my scale_y function.
  • x-axis label orientation
  • repeated items in the key, hence conditional expression for title.
  • no clustering support, hence nasty positioning expressions

What I have is brittle---it works on my Linux system but relies heavily on shell helpers. But it works. Hopefully this information helps others or can be taken as feedback to improve gnuplot to make it even more awesome!

Colours

To get the colours in each data set to line up, I set linetype cycle nfiles and hope gnuplot defines sufficient colours.

The reason for doing this is to reset the colour assignment between file sets (categories on the y-axis) so that the same bar in different file sets had the same colour. By explictly setting it to cycle after the known number of files (chart bars) I ensured the colours matched.

Bar dimensions

The bar dimensions (boxwidth and boxdepth) are relative to the axis ranges and it's therefore difficult to make them square.

If a bar rests on the extreme of the y axis (lower or upper) then it is cut in half vertically (it's visible box depth is half the defined boxdepth value).

I had to play with scaling the y axis so that my two category sets were displayed near each other. The default behaviour displayed a range from 1 to 2 in steps of 0.2 and placed the two plots at 1 and 2, making them appear far apart.

I tried set ytics to no effect. I ended up scaling the y value.

scale(y) = 0.1 * y - 0.05
set yrange [0:1]
set boxdepth (0.8 / clusters)

all the numbers are fudge factors. clusters is the number of clusters (rows in files). The numbers I have maintain a square appearance with my test data (I have data to display up to 5 clusters).

I had to start the x axis at 0.5 otherwise the first bar would appear too far in (if x starts at 0) or vertically half-cut off (if x starts at 1).

set xrange [0.5:*]

Axis labels

I replaced the automatic tick marks with custom labels. On the Y axis:

 set ytics ()
 set for [c = 1:ncats] ytics add (word(CATS,c) scale_y(c) )

Similarly for the x axis. First, where there is 1 cluster I label each category

 set xtics ()
 set for [f = 1:nfiles] xtics add (label(word(cat_files[1],f)) f) 

Or where there are multiple clusters, I label the clusters:

 set xtics ()
 set set for [c = 2:(clusters+1)] xtics add (cell(f,c,1) (nfiles/2)+2+((c-2)*nfiles))

Here, cell is a shell helper that returns the value from file f at row c position 1. The horrible formula is a hack to position the label along the axis in the middle of the cluster. I also use shell helpers to get the number of clusters. I could not find a way in gnuplot to query rows and columns. Note that previously (when 2D plotting) I would have used xticlabels(1) to plot the clusered x-axis.

I wanted to turn the x labels to run perpendicular to the axis but this doesn't seem possible. I also wanted to tweak their positions with 'right' alignment but couldn't make that work either.

Key labels

An entry is added into the key for each bar plotted. Because these are repeated within each category they get duplicted in the key. I made it only add them once by using a conditional, changing from

title columnhead(1)

to

title (c==1 ? columnhead(1) : '')

I only show the key when there is more than one cluster.

Clustering

The 2D plot was clustered. I had difficulty making a clustered appearance in 3D. If I run the plot on clustered data then they overlay (they have the same Y values). To overcome this I used a formula to shift latter clusters along the x-axis and add a gap between them. So instead of a simple value for x:

... using (f):(scale_y(c)):2 ...

I have a formula:

... using (f+column(0)*(nfiles+2)):(scale_y(c)):2 ...

where f is the file number (eq. the bar number), column(0) is the cluster number, nfiles is the number of files (eq. the numer of bars, or cluster size), and 2 is the separator gap.

Incidentally, whilst doing this I discovered that ($0) doesn't work in gnuplot 5.3, you have to use column(0) instead ($0 works in 5.2.4).

I used the Arch Linux AUR package to build which gave me a package gnuplot-git-5.3r20180810.10527-1-x86_64.pkg.tar.xz.

enter image description here

An example plot with one cluster.

enter image description here

An example plot with three clusters and a key legend.


There are probably better ways to do the things I've done here. Being relatively new to gnuplot, I would be interested in any ways to improve upon this solution.

0
votes

(I can't figure out how to format text in a comment, so I'll provide this as a separate answer)

  • Matching color: This is more reliably done by providing the color in a separate field of the using spec. From the help text:

    splot with boxes requires at least 3 columns of input data. Additional input columns may be used to provide information such as box width or fill color.

      3 columns:  x  y  z
      4 columns:  x  y  z  [x_width or color]
      5 columns:  x  y  z  x_width  color
    

    The last column is used as a color only if the splot command specifies a variable color mode. Examples

      splot 'blue_boxes.dat' using 1:2:3 fc "blue"
      splot 'rgb_boxes.dat' using 1:2:3:4 fc rgb variable
      splot 'category_boxes.dat' using 1:2:3:4:5 lc variable
    

    In the first example all boxes are blue and have the width previously set by set boxwidth. In the second example the box width is still taken from set boxwidth because the 4th column is interpreted as a 24-bit RGB color. The third example command reads box width from column 4 and interprets the value in column 5 as an integer linetype from which the color is derived.

  • Half-depth boxes at each end: This was an autoscaling bug (now fixed)