
I have this df:

     ID values Pop1
1 PDAC1 611648 Nafr
2 PDAC1 322513 Nafr
3 PDAC2 381089 Nafr
4 PDAC2  16941 Nafr
5 PDAC3  21454 Jud
6 PDAC3 658802 Jud

I want to make two histograms using facet_wrap on the "Pop1" column:

ggplot(all.samples2) +
  aes(x = values, fill = Pop1, colour = Pop1, after_stat(density)) +
  geom_histogram(bins = 30L) +
  theme_minimal() +
  facet_wrap(vars(Pop1)) +
  theme_bw() +
  theme(aspect.ratio=1)

But instead of plotting the density, I want the y-axis to show the percentage of observations within each population that fall into each bin. For example, for Pop1 = Nafr, the histogram would show 25% of the data in the bin from 0 to 300000, 50% in the bin from 300000 to 600000, and 25% in the bin from 600000 to 900000.

How can I do that?

Thanks


1 Answer


You can use geom_bar instead of geom_histogram and map y to ..prop..:

ggplot(df) +
  aes(x = values, fill = Pop1, colour = Pop1) +
  theme_minimal() +
  facet_wrap(vars(Pop1)) +
  geom_bar(aes(y = ..prop..)) +
  theme_bw() +
  theme(aspect.ratio = 1) +
  labs(y = "") +
  scale_y_continuous(labels = scales::percent)
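
One caveat: geom_bar counts each distinct x value instead of grouping values into bins, so with continuous data like values, every observation tends to get its own bar. If binned percentages are what you are after, one option is to compute them yourself and plot the result with geom_col. The following is only a minimal sketch under some assumptions: the bin edges (0, 300000, 600000, 900000) are taken from the example in the question, the data frame is rebuilt by hand from the sample rows, and the names breaks, bin, counts, pct and prop are illustrative.

```r
# Sample data from the question (all.samples2 there, df in the answer).
df <- data.frame(
  ID     = c("PDAC1", "PDAC1", "PDAC2", "PDAC2", "PDAC3", "PDAC3"),
  values = c(611648, 322513, 381089, 16941, 21454, 658802),
  Pop1   = c("Nafr", "Nafr", "Nafr", "Nafr", "Jud", "Jud")
)

# Assumed bin edges, copied from the example in the question.
breaks <- c(0, 300000, 600000, 900000)
df$bin <- cut(df$values, breaks = breaks, include.lowest = TRUE, dig.lab = 6)

# Count rows per bin and population, then divide each column
# (one column per population) by that population's total.
counts <- table(bin = df$bin, Pop1 = df$Pop1)
pct <- as.data.frame(prop.table(counts, margin = 2), responseName = "prop")
print(pct)

# Build the plot only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(pct) +
    aes(x = bin, y = prop, fill = Pop1, colour = Pop1) +
    geom_col() +
    facet_wrap(vars(Pop1)) +
    theme_bw() +
    theme(aspect.ratio = 1) +
    labs(y = "") +
    scale_y_continuous(labels = scales::percent)
  # Call print(p) to draw the plot.
}
```

For the sample rows, the Nafr panel comes out as 25%, 50% and 25% across the three bins, which matches the numbers asked for in the question. Because the bins are precomputed, the x-axis is discrete (one label per bin), so adjust breaks to whatever bin width suits the real data.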
