I'm having a problem formatting ggplot2 to display a stacked bar plot with cumulative percent on the Y axis and counts within the bars. I can do one plot of each type (one with percent on the Y axis, one with counts in the bars) but not both. Here's what I have:

group <- c(1,1,1,2,2,2)
ind <- c(1,2,3,1,2,3)
count <- c(98,55,10,147,31,3)
df <- data.frame(group, ind, count)

library(ggplot2)
library(scales)

# label must map to a column that exists in df; the counts are in "count"
ggplot(df, aes(y=count, x=factor(group), fill=factor(ind), label=count)) +
  geom_bar(stat = "identity") +
  ylab("Percent Level 1 Classes") +
  scale_fill_discrete(name="Level 1\nClasses") +
  xlab("Level 2 Groups") +
  geom_text(size = 3, position = position_stack(vjust = 0.5))

This produces the following plot with counts but no percent on Y axis:

bar plot 1

The second version of the plot produces the percent on the Y axis but no counts in the bars:

ggplot(df, aes(y=count, x=factor(group), fill=factor(ind))) +
  geom_bar(position = "fill", stat = "identity") +
  ylab("Percent Level 1 Classes") +
  scale_fill_discrete(name="Level 1\nClasses") +
  xlab("Level 2 Groups")

bar plot 2

But I can't get it to do both. To save space I haven't shown it, but I did try putting "label=cfreq" in the "aes" statement, to no avail; it seems to conflict with the "geom_text" option. Any help would be greatly appreciated.

1 Answer

This seems to do what you want:

group <- c(1,1,1,2,2,2)
ind <- c(1,2,3,1,2,3)
count <- c(98,55,10,147,31,3)
df <- data.frame(group, ind, count)

library(ggplot2)
library(scales)

ggplot(df, aes(y=count, x=factor(group), fill=factor(ind))) +
  geom_bar(position = "fill", stat = "identity") +
  geom_text(aes(label = count), position = position_fill(vjust = 0.5)) +
  ylab("Percent Level 1 Classes") +
  scale_fill_discrete(name="Level 1\nClasses") +
  xlab("Level 2 Groups")

Stacked labels
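
The y axis in the plot above is labelled as proportions from 0 to 1 rather than as percentages. As an optional tweak that is not part of the code above, here is a minimal sketch that assumes the `percent` labeller from the scales package (which is already loaded) is acceptable for the axis; the variable name `p` is only for illustration.

```r
library(ggplot2)
library(scales)

df <- data.frame(
  group = c(1, 1, 1, 2, 2, 2),
  ind   = c(1, 2, 3, 1, 2, 3),
  count = c(98, 55, 10, 147, 31, 3)
)

# Same layers as the plot above; only the axis labelling is changed.
p <- ggplot(df, aes(y = count, x = factor(group), fill = factor(ind))) +
  geom_bar(position = "fill", stat = "identity") +
  geom_text(aes(label = count), position = position_fill(vjust = 0.5)) +
  # Assumption: relabel the 0-1 axis as percent using scales::percent
  scale_y_continuous(labels = percent) +
  ylab("Percent Level 1 Classes") +
  scale_fill_discrete(name = "Level 1\nClasses") +
  xlab("Level 2 Groups")

print(p)
```

The bars and labels are unchanged; only the tick labels on the y axis are formatted differently.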

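I think the default position for geom_text() is "identity". Switching it to "stack" doesn't solve the problem either, because the labels are then placed on the original scale of the counts rather than on the 0-to-1 scale of the filled bars, so the y axis stretches to fit the labels and the bars get shrunk down to basically a line at the bottom of the plot. position_fill() puts the labels on the same scale as the bars. Using vjust = 0.5 centers each label in its segment; the default is 1, which puts the labels at the top of each segment.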
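
To make the scale argument concrete, here is a rough sketch in base R of where the labels end up under position_fill(vjust = 0.5). It is an illustration of the arithmetic, not ggplot2's actual implementation; the column names `prop`, `top` and `label_y` are made up for this example, and it assumes the default stacking order, with the last fill level at the bottom of each bar.

```r
group <- c(1, 1, 1, 2, 2, 2)
ind   <- c(1, 2, 3, 1, 2, 3)
count <- c(98, 55, 10, 147, 31, 3)
df <- data.frame(group, ind, count)

# Share of each segment within its bar (the heights that position = "fill" draws)
df$prop <- df$count / ave(df$count, df$group, FUN = sum)

# Assumption: segments are stacked with the last level of ind at the bottom
df <- df[order(df$group, -df$ind), ]

# Upper edge of each segment on the 0-1 scale
df$top <- ave(df$prop, df$group, FUN = cumsum)

# vjust = 0.5 places the label halfway down the segment; vjust = 1 would use top
df$label_y <- df$top - 0.5 * df$prop

print(df)
```

All values of label_y fall between 0 and 1, which matches the scale of the filled bars. With position_stack() the same calculation would use the raw counts, giving label heights up to the bar totals (163 and 181), which is why the bars collapse.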