0
votes

I'm looking to make a stacked bar chart whose colors represent values from a separate data column, and to add an accurate color bar, using only base graphics in R. There is one other post about this, but it is pretty disorganized and in the end doesn't help me answer my question.

# create reproducible data
d <- read.csv(text='Day,Location,Length,Amount
            1,4,3,1.1
            1,3,1,.32
            1,2,3,2.3
            1,1,3,1.1
            2,0,0,0
            3,3,3,1.8
            3,2,1,3.54
            3,1,3,1.1',header=T)

# colors will be based on values in the Amount column
v1 <- d$Amount
# make some colors based on Amount - normalized
z <- v1/max(v1)*1000
colrs <- colorRampPalette(c('lightblue','blue','black'))(1000)[z]

# create a 2d table of the data needed for plotting
tab <- xtabs(Length ~ Location + Day, d)
# create a stacked bar plot
barplot(tab,col=colrs,space=0)

# create a color bar
plotr::color.bar

This does produce a color-coded stacked bar graph, but the colors do not represent the data accurately.

For Day 1, Locations 4 and 1 should be identical in color. As another example, the first and last entries in the Amount column are identical, but the color at the top of the left column doesn't match the color at the bottom of the right column.

Also, I found how to make a color bar on a different post and it uses the plotr::color.bar code, but plotr apparently isn't a package and I'm not sure how to carry on.

How can I get the colors to match the appropriate section and add an accurate color bar?

3 Answers

1
votes

I hope the "pretty disorganized" post is not my answer to How to create a time series plot in the style of a horizontal stacked bar plot in r! That's fine, no offense taken.

The solution can be adapted to your data as follows:

## store data
df <- read.csv(text='Day,Location,Length,Amount\n1,4,3,1.1\n1,3,1,.32\n1,2,3,2.3\n1,1,3,1.1\n2,0,0,0\n3,3,3,1.8\n3,2,1,3.54\n3,1,3,1.1',header=T);

## extract bar segment lengths from Length and bar segment colors from a function of Amount, both stored in a logical matrix form
lengths <- xtabs(Length~Location+Day,df);
amounts <- xtabs(Amount~Location+Day,df);
colors <- matrix(colorRampPalette(c('lightblue','blue','black'))(1001)[amounts/max(amounts)*1000+1],nrow(amounts));

## transform lengths into an offset matrix to appease design limitation of barplot(). Note that colors will be flattened perfectly to accord with this offset matrix
lengthsOffset <- as.matrix(setNames(reshape(cbind(id=1:length(lengths),stack(as.data.frame(unclass(lengths)))),dir='w',timevar='ind')[-1],colnames(lengths)));
lengthsOffset[is.na(lengthsOffset)] <- 0;

## draw plot
barplot(lengthsOffset,col=colors,space=0,xlab='Day',ylab='Length');

[Figure: offset stacked bar plot]
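The question also asks for a color bar, which the plot above does not include. Here is a minimal sketch of one way to add it with base graphics only, assuming the same 1001-step palette and Amount scaling as above. The names pal and breaks, the loop-based construction of lengthsOffset, and the layout widths and margins are my own choices for illustration, not part of the original code.

```r
## self-contained sketch: offset-stacked bar plot plus a color bar, base graphics only
df <- read.csv(text='Day,Location,Length,Amount\n1,4,3,1.1\n1,3,1,.32\n1,2,3,2.3\n1,1,3,1.1\n2,0,0,0\n3,3,3,1.8\n3,2,1,3.54\n3,1,3,1.1',header=TRUE)
lengths <- xtabs(Length~Location+Day,df)
amounts <- xtabs(Amount~Location+Day,df)
pal <- colorRampPalette(c('lightblue','blue','black'))(1001)
colors <- pal[as.vector(amounts/max(amounts)*1000+1)]

## offset matrix built with a plain loop (assumed equivalent to the reshape() version):
## column j carries the segments of bar j, zeroes everywhere else
n <- nrow(lengths)
lengthsOffset <- matrix(0,n*ncol(lengths),ncol(lengths),dimnames=list(NULL,colnames(lengths)))
for (j in seq_len(ncol(lengths))) lengthsOffset[(j-1)*n+seq_len(n),j] <- lengths[,j]

## two panels side by side: bar plot on the left, narrow color bar on the right
layout(matrix(1:2,nrow=1),widths=c(4,1))
barplot(lengthsOffset,col=colors,space=0,xlab='Day',ylab='Length')

## color bar: one image cell per palette entry; cell k spans the Amount range mapped to pal[k]
op <- par(mar=c(5.1,1,4.1,4))
breaks <- (0:1001)/1000*max(amounts)
image(x=c(0,1),y=breaks,z=matrix(seq_along(pal),nrow=1),col=pal,axes=FALSE,xlab='',ylab='')
axis(4,las=1)
mtext('Amount',side=4,line=2.5)
box()

## restore the single-panel layout and the previous margins
layout(1)
par(op)
```

Because breaks uses the same per mille steps as the color index, a tick on the color bar should line up with the color that the same Amount receives in the bars.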


Notes

  • In your question, you tried to build a color vector using colrs <- colorRampPalette(c('lightblue','blue','black'))(1000)[z] with z being the 8 original Amount values converted to "per mille" form. This had a slight flaw, in that one of the z elements was zero, which is not a valid index value. That's why you got 7 colors, when it should have been 8. I fixed this in my code by adding 1 to the per mille values and generating 1001 colors.
  • Also related to generating colors, instead of just generating 8 colors (i.e. one per original Amount value), I generated a complete matrix of colors to parallel the lengths matrix (which you called tab in your code). This color matrix can actually be used directly as the color vector passed to barplot()'s col argument, because internally it is flattened to a vector (at least conceptually) and will correspond with the offset bar segment lengths that we'll pass to barplot() for the height argument (see next note).
  • The linchpin of this solution, as I describe in more detail in my aforementioned post, is creating an "offset matrix" of the bar segment lengths with zeroes in adjacent columns, such that a different color can be assigned to every segment. I create this as lengthsOffset from the lengths matrix.
  • Note that, perhaps somewhat counter-intuitively, lower index values in the height argument are drawn by barplot() as lower segments, and vice-versa, meaning the textual display when you print that data in your terminal is vertically reversed from how it appears in the bar plot. You can vertically reverse the lengthsOffset matrix and the colors vector if you want the opposite order, but I haven't done this in my code.
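
The zero-index issue from the first note can be checked in isolation. This is a small sketch using the eight Amount values from the question; the variable names bad and good are only for illustration.

```r
## the eight Amount values from the question, converted to per mille of the maximum
amount <- c(1.1,.32,2.3,1.1,0,1.8,3.54,1.1)
z <- amount/max(amount)*1000

## indexing with z directly: the element equal to 0 selects nothing and is silently dropped
bad <- colorRampPalette(c('lightblue','blue','black'))(1000)[z]
length(bad)   # 7

## shifting by 1 and generating 1001 colors keeps every index valid
good <- colorRampPalette(c('lightblue','blue','black'))(1001)[z+1]
length(good)  # 8
```

Note that the fractional index values are truncated toward zero by R, so each Amount lands in a well-defined palette slot.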

For reference, here are all the data structures:

df;
##   Day Location Length Amount
## 1   1        4      3   1.10
## 2   1        3      1   0.32
## 3   1        2      3   2.30
## 4   1        1      3   1.10
## 5   2        0      0   0.00
## 6   3        3      3   1.80
## 7   3        2      1   3.54
## 8   3        1      3   1.10
lengths;
##         Day
## Location 1 2 3
##        0 0 0 0
##        1 3 0 3
##        2 3 0 1
##        3 1 0 3
##        4 3 0 0
amounts;
##         Day
## Location    1    2    3
##        0 0.00 0.00 0.00
##        1 1.10 0.00 1.10
##        2 2.30 0.00 3.54
##        3 0.32 0.00 1.80
##        4 1.10 0.00 0.00
colors;
##      [,1]      [,2]      [,3]
## [1,] "#ADD8E6" "#ADD8E6" "#ADD8E6"
## [2,] "#4152F5" "#ADD8E6" "#4152F5"
## [3,] "#0000B3" "#ADD8E6" "#000000"
## [4,] "#8DB1EA" "#ADD8E6" "#0000FA"
## [5,] "#4152F5" "#ADD8E6" "#ADD8E6"
lengthsOffset;
##    1 2 3
## 1  0 0 0
## 2  3 0 0
## 3  3 0 0
## 4  1 0 0
## 5  3 0 0
## 6  0 0 0
## 7  0 0 0
## 8  0 0 0
## 9  0 0 0
## 10 0 0 0
## 11 0 0 0
## 12 0 0 3
## 13 0 0 1
## 14 0 0 3
## 15 0 0 0
0
votes

Based on the comments below:

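library(ggplot2)
# group = Location stacks one segment per Location (the order aesthetic was deprecated in ggplot2 2.0.0)
ggplot(d, aes(x = Day, y = Length)) + geom_bar(aes(fill = Amount, group = Location), stat = "identity")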
0
votes

I think the mistake is in how the colors are defined. The bar chart needs only 5 colors, because there are 5 locations, and one of those colors won't be used because the first location (Location 0) has zero length on every day.

Fix:

colrs <- colorRampPalette(c('yellow', 'lightblue','blue','black', 'lightblue'))(5)

[Figure: output after fixing the colrs vector]

Notice that 'yellow' isn't drawn, as there are 0 observations in its group (in the sample data from the OP).
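
To see why five colors are enough here, it may help to recall that barplot() applies col to the rows of the height matrix and reuses the same colors in every bar. The sketch below assumes the data frame d from the question; the legend arguments are my own addition for illustration. Note that this colors the segments by Location rather than by Amount.

```r
## data from the question
d <- read.csv(text='Day,Location,Length,Amount\n1,4,3,1.1\n1,3,1,.32\n1,2,3,2.3\n1,1,3,1.1\n2,0,0,0\n3,3,3,1.8\n3,2,1,3.54\n3,1,3,1.1',header=TRUE)
tab <- xtabs(Length ~ Location + Day, d)

## one row per Location (0 to 4), one column per Day
dim(tab)

## one color per row; barplot() recycles these five colors for each Day
colrs <- colorRampPalette(c('yellow','lightblue','blue','black','lightblue'))(5)
barplot(tab, col=colrs, space=0, xlab='Day', ylab='Length',
        legend.text=TRUE, args.legend=list(title='Location'))
```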