2
votes

I have a stacked percentage barplot in ggplot, and I'd like to put the total number of observations on top of each stacked bar (while keeping the stacked bars in percentages). Yet I keep running into problems.

Below is my code to produce the percentage barplot:

    # sample dataset
    library(ggplot2)
    set.seed(123)
    cat1 <- sample(letters[1:3], 500, replace = TRUE, prob = c(0.1, 0.2, 0.65))
    cat2 <- sample(letters[4:8], 500, replace = TRUE, prob = c(0.3, 0.4, 0.75, 0.5, 0.1))
    df <- data.frame(cat1, cat2)

    # the barplot
    ggplot(df, aes(x = cat1)) +
      geom_bar(aes(fill = cat2), position = "fill", color = "black") +
      scale_y_continuous(labels = scales::percent) +
      labs(y = "Percentage") +
      # this final line is me trying to add the label
      geom_text(aes(label = cat1))

    # this is the observation number I want to display
    table(df$cat1)

But the plot code above gives this error:

    Error: geom_text requires the following missing aesthetics: y

So I have two questions:

  1. How do I put the total number of observations for each level of cat1 (as an "N=" label) on top of each stacked bar?
  2. What exactly is the y for the barplot in my code (aes(x=...))? I have x but no y, yet the plot seems to work.

Thanks!

2
@Les H: Thanks for the post. Unfortunately, it doesn't solve my problem. I get an error saying that "value" and "ymax" are unknown parameters when I try geom_text(aes(label = value, ymax = 0)). - debbybeginner

2 Answers

3
votes

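You could try putting the labels in a small data frame and passing it to geom_text:

    # one row per bar: x is the bar, y sits just above the top of the bar (100%),
    # and z is the count to display (hardcoded here)
    temp <- data.frame(x = c("a", "b", "c"), y = c(1.02, 1.02, 1.02), z = c(51, 101, 348))

    ggplot(df, aes(x = cat1)) +
      geom_bar(aes(fill = cat2), position = "fill", color = "black") +
      scale_y_continuous(labels = scales::percent) +
      labs(y = "Percentage") +
      # the labels come from temp, not df
      geom_text(data = temp, aes(x = x, y = y, label = z))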

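On the second question: geom_bar() defaults to stat = "count", which counts the rows for each x (and each fill group) and uses that count as y. geom_text() uses stat = "identity", so it needs a y supplied, which is what the error is about. The sketch below inspects the computed values with layer_data() and builds the label data frame from table(df$cat1) instead of typing the counts in, so the labels stay correct if the data change. The names p, bar_data and counts are introduced here for illustration only, and the "N=" prefix is added as one possible label format.

```r
library(ggplot2)

# same sample data as in the question
set.seed(123)
cat1 <- sample(letters[1:3], 500, replace = TRUE, prob = c(0.1, 0.2, 0.65))
cat2 <- sample(letters[4:8], 500, replace = TRUE, prob = c(0.3, 0.4, 0.75, 0.5, 0.1))
df <- data.frame(cat1, cat2)

# no y in aes(): the default stat = "count" computes it
p <- ggplot(df, aes(x = cat1)) +
  geom_bar(aes(fill = cat2), position = "fill", color = "black")

# the values ggplot2 computed for the bar layer, including a `count` column
bar_data <- layer_data(p, 1)

# one row per bar, with the totals taken from the data
counts <- table(df$cat1)
temp <- data.frame(x = names(counts), y = 1.02, z = as.vector(counts))

p +
  scale_y_continuous(labels = scales::percent) +
  labs(y = "Percentage") +
  geom_text(data = temp, aes(x = x, y = y, label = paste0("N=", z)))
```

Because the labels live in their own data frame with explicit x and y columns, geom_text() no longer needs anything computed from df.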

3
votes

If you don't want to hardcode your summary labels, here's a slightly different approach (but still a bit of a hack) using dplyr to calculate your percentages and format your labels.

I've also reversed your legend to match the order on the chart :)

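    library(dplyr)
    library(ggplot2)

    df2 <- df %>%
      group_by(cat1, cat2) %>%
      summarise(n = n()) %>%   # the result is still grouped by cat1
      mutate(
        percent = n / sum(n),            # share of each cat2 level within its cat1 bar
        cum_percent = cumsum(percent),   # running total; reaches 1 at the last cat2 level
        # label only the last level ("h") so each bar gets a single "N=" label at the top
        label = ifelse(cat2 == "h", paste0("N=", sum(n)), "")
      )

    ggplot(df2, aes(x = cat1, y = percent, fill = cat2)) +
      scale_y_continuous(labels = scales::percent) +
      labs(y = "Percentage") +
      geom_bar(position = "fill", color = "black", stat = "identity") +
      geom_text(aes(y = cum_percent, label = label), vjust = -1) +
      guides(fill = guide_legend(reverse = TRUE))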

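The approach above relies on every bar containing the level "h"; a bar without it would get no label. A possible variation, sketched here under the assumption that a second small data frame is acceptable, keeps the bars computed by geom_bar() and puts the totals in their own data frame with dplyr's count(). The name totals is made up for this example.

```r
library(dplyr)
library(ggplot2)

# same sample data as in the question
set.seed(123)
cat1 <- sample(letters[1:3], 500, replace = TRUE, prob = c(0.1, 0.2, 0.65))
cat2 <- sample(letters[4:8], 500, replace = TRUE, prob = c(0.3, 0.4, 0.75, 0.5, 0.1))
df <- data.frame(cat1, cat2)

# one row per bar: the total count and a ready-made label
totals <- df %>%
  count(cat1) %>%
  mutate(label = paste0("N=", n))

ggplot(df, aes(x = cat1)) +
  geom_bar(aes(fill = cat2), position = "fill", color = "black") +
  # y = 1 is the top of every filled bar; vjust nudges the text above it
  geom_text(data = totals, aes(x = cat1, y = 1, label = label), vjust = -0.5) +
  scale_y_continuous(labels = scales::percent) +
  labs(y = "Percentage")
```

Since position = "fill" always scales each bar to 100%, a fixed y = 1 works for every bar and no cumulative sum is needed.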