
I want to show per-bin percentages by group on a histogram in ggplot. I have two groups and plot a histogram of them.

Example:

ggplot(data, aes(x = variable)) +
    geom_histogram(aes(group = grp, fill = grp, y=(..count../sum(..count..))), col = NA, alpha = 0.35) +
    stat_bin(aes(y=(..count../sum(..count..)), label=(..count../sum(..count..)), group=grp), geom="text",position=position_stack(vjust=0.5))

When I run the code above, the labels show each count as a share of the overall total (label=(..count../sum(..count..))). What I would like to see instead is the percentage within each histogram column (bin), so that the values in any one bin sum to 100%.

For instance:

  • Bin1: 20%; 80%
  • Bin2: 30%; 70%
  • etc.
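
To make the distinction concrete, here is a minimal base R sketch with made-up counts (the matrix and its numbers are hypothetical, chosen only to match the example above). It contrasts a share of the grand total with a share within each bin:

```r
# Hypothetical counts for two groups across three bins (made-up numbers).
counts <- matrix(
  c(2, 8,
    3, 7,
    5, 5),
  nrow = 3, byrow = TRUE,
  dimnames = list(bin = c("Bin1", "Bin2", "Bin3"), grp = c("a", "b"))
)

# Share of the grand total: all six cells together sum to 1.
share_of_total <- counts / sum(counts)

# Share within each bin: every row sums to 1.
share_within_bin <- prop.table(counts, margin = 1)

print(round(share_of_total, 2))
print(round(share_within_bin, 2))
```

The second table is the one wanted here: Bin1 comes out as 0.2 and 0.8, Bin2 as 0.3 and 0.7.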

If you would like to use mtcars, here is the same example:

ggplot(mtcars, aes(x = qsec)) +
  geom_histogram(aes(group = am, fill = am, y=(..count../sum(..count..))), col = NA, alpha = 0.35) +
  stat_bin(aes(y=(..count../sum(..count..)), label=(..count../sum(..count..)), group=am), geom="text",position=position_stack(vjust=0.5))

Do you have any idea how to do this?

Please share a little bit of sample data to make this question reproducible. dput() is nicest for sharing data because it is copy/pasteable, e.g., dput(data[1:10, ]) for the first 10 rows. Please choose a nice small subset that illustrates the problem. - Gregor Thomas
I have added a data example. - idiayp
Switching ..count../sum(..count..) to ..ndensity.. seems to be closer, but I'm confused because it doesn't seem to always be what you want... - Gregor Thomas

1 Answer


Maybe this is what you are looking for. My approach uses an auxiliary function and some dplyr to compute the proportion of each group within each bin:

library(ggplot2)
library(dplyr)

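# Share of each group within its bin: a group's count divided by the bin total.
# The arguments are the values computed by stat_bin (bin centre, count, group id).
f <- function(x, count, group) {
  data.frame(x, count, group) %>%
    add_count(x, wt = count) %>%   # n = total count in the bin, across groups
    group_by(x, group) %>%
    mutate(prop = count / n) %>%   # an empty bin gives 0 / 0 = NaN
    pull(prop)
}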

ggplot(mtcars, aes(x = qsec)) +
  geom_histogram(
    aes(
      group = factor(am), fill = factor(am),
      y = f(..x.., ..count.., ..group..)
    ),
    col = NA, alpha = 0.35, binwidth = 1
  ) +
  stat_bin(
    aes(
      y = f(..x.., ..count.., ..group..),
      label = scales::percent(f(..x.., ..count.., ..group..), accuracy = 1), group = am
    ),
    binwidth = 1, geom = "text", position = position_stack(vjust = 0.5)
  )
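
To see what the helper returns without drawing a plot, here is a small check on made-up input. The toy data frame and its numbers are hypothetical; f is the function from this answer, repeated so that the snippet runs on its own:

```r
library(dplyr)

# Helper from the answer, repeated here so the snippet is self-contained.
f <- function(x, count, group) {
  data.frame(x, count, group) %>%
    add_count(x, wt = count) %>%
    group_by(x, group) %>%
    mutate(prop = count / n) %>%
    pull(prop)
}

# Made-up bin centres and counts: two bins, two groups in each.
toy <- data.frame(
  x     = c(15, 16, 15, 16),
  count = c(2, 3, 8, 7),
  group = c(1, 1, 2, 2)
)

# One proportion per row, in the original row order.
toy$prop <- f(toy$x, toy$count, toy$group)
print(toy)

# The proportions within each bin add up to 1.
print(tapply(toy$prop, toy$x, sum))
```

For this input the proportions are 0.2, 0.3, 0.8 and 0.7, so each bin sums to 1, which is what position_stack needs for the bars to reach 100%.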