My problem is that I want to reorder the factors in my ggplot output, produced using geom_bar(position = "fill"), so that the highest proportion of the positive class is closest to the y-axis. I have found a working solution, but from my research it seems there is a more efficient one lurking that I cannot find.
I have read through the question "Order Bars in ggplot2 bar graph", but I can't find a solution there for ordering by proportions, that is, by values that are not explicitly present in the data frame but are a summary statistic.
I have had a look at the Modifying Factor Order section of the book R for Data Science, where a solution is proposed in which a summary data frame is produced with a "prop" column and a line graph is created from these values using fct_reorder2(). However, I can't seem to apply similar logic to a "filled" bar graph.
The solution I finally stumbled upon came from this source, #267 REORDER A VARIABLE IN GGPLOT2, whereby you just set the new factor levels using mutate(). However, instead of defining the order myself, I created a data frame that orders the factor by the proportion of the positive class.
What I'd like to know is if there is a more efficient way to do this, perhaps in one long pipe operation?
Here's a reproducible example:
library(ggplot2)
library(dplyr)

# Toy data: four levels of `variable`, four observations each
variable <- c(rep("alpha", 4),
              rep("beta", 4),
              rep("gamma", 4),
              rep("delta", 4))
class <- c(rep("1", 4),
           "1", "1", "0", "0",
           rep("0", 3), "1",
           rep("1", 3), "0")
dframe <- data.frame(variable, class)

# Proportion of the positive class ("1") per level, highest first
plot_order <- dframe %>%
  count(variable, class) %>%
  group_by(variable) %>%
  mutate(prop = prop.table(n)) %>%
  filter(class == "1") %>%
  arrange(desc(prop))

lvls <- as.character(plot_order$variable)

# Re-level `variable` in that order, then plot
dframe %>%
  mutate(variable = factor(variable, levels = lvls)) %>%
  ggplot(aes(x = variable, fill = class)) +
  geom_bar(position = "fill") +
  labs(y = "Proportion")

Here's the output of plot_order:
# A tibble: 4 x 4
# Groups:   variable [4]
  variable class     n  prop
  <fct>    <fct> <int> <dbl>
1 alpha    1         4  1
2 delta    1         3  0.75
3 beta     1         2  0.5
4 gamma    1         1  0.25
And the outcome:
Bar graph with ordered factors based on position "fill"
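For reference, the kind of single-pipe version I am after might look like the sketch below. It assumes the forcats package, which my example above does not use, and relies on fct_reorder() summarising a logical vector (class == "1") with mean() per level, so the mean is the proportion of the positive class. The names ordered_df and p are only there for illustration, and I am not sure this is the most idiomatic approach.

```r
library(ggplot2)
library(dplyr)
library(forcats)  # assumption: forcats is installed; not part of the original example

# Same toy data as in the reproducible example
dframe <- data.frame(
  variable = c(rep("alpha", 4), rep("beta", 4), rep("gamma", 4), rep("delta", 4)),
  class    = c(rep("1", 4), "1", "1", "0", "0", rep("0", 3), "1", rep("1", 3), "0")
)

# Reorder `variable` by the share of class == "1" within each level.
# mean() of a logical vector is the proportion of TRUE values;
# .desc = TRUE puts the highest proportion first (closest to the y-axis).
ordered_df <- dframe %>%
  mutate(variable = fct_reorder(variable, class == "1", .fun = mean, .desc = TRUE))

p <- ordered_df %>%
  ggplot(aes(x = variable, fill = class)) +
  geom_bar(position = "fill") +
  labs(y = "Proportion")

p
```

This would avoid the intermediate plot_order data frame and the separate lvls vector, since the summary is computed inside the mutate() call.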
Thanks in advance.
geom_histogram() you can access ..count.. or ..density.. – red.panda