Suppose I want to use ggplot2 in R to produce a bar plot showing the percentages of one categorical variable across another categorical variable. Below is a small data frame and some code that produces the bar plot I want.

MYDATA <- data.frame(Region = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
                     Sex    = c("M", "F", "M", "F", "M", "F", "M", "F", "M", "F"),
                     Count  = c(185, 130, 266, 201, 304, 283, 102, 60, 55, 51));

library(ggplot2);

ggplot(data = MYDATA, aes(x = Region, y = Count, fill = Sex)) + 
       geom_bar(position = "fill", stat = 'identity') + 
       scale_y_continuous(labels = scales::percent) +   
       scale_fill_manual(values = c("Maroon1", "RoyalBlue1")) +
       ggtitle("Figure 1: Sex breakdown by Region") +
       xlab("Region") + ylab("Percentage");

[Image: Figure 1, the stacked percentage bar plot produced by the code above]

Now, I would like to supplement this bar plot by adding text that displays the raw count values inside the bars. For the females, I would like the counts displayed at the top of the graph, inside the pink bars. For the males, I would like the counts displayed at the bottom of the graph, inside the blue bars. I have tried using geom_text, but I cannot figure out how to place the labels where I want them. My attempts have generally placed each label at a y position equal to its raw count, which stretches the y axis and destroys the percentage scale of the plot. Can any of you learned people tell me how to add the raw count values as text in the bars without ruining the rest of the plot?

1 Answer

Okay, so after posting this question I figured out a solution. Despite answering my own question (sorry guys) I will leave this post up in case anyone has the same problem they need to solve. Please feel free to add other solutions also.

The trick that eventually worked was to break the problem down and specify separate labels for each sex: two calls to geom_text, each using ifelse to produce labels for only one sex at a time, with the y position set manually to the desired height in each call. Because position = "fill" rescales every bar to run from 0 to 1, the heights are given as proportions (0.95 for the top, 0.05 for the bottom). Here is the code and plot:

ggplot(data = MYDATA, aes(x = Region, y = Count, fill = Sex)) + 
       geom_bar(position = "fill", stat = 'identity') + 
       geom_text(aes(label = ifelse(Sex == "F", Count, "")), y = 0.95) +
       geom_text(aes(label = ifelse(Sex == "M", Count, "")), y = 0.05) +
       scale_y_continuous(labels = scales::percent) +   
       scale_fill_manual(values = c("Maroon1", "RoyalBlue1")) +
       ggtitle("Figure 1: Sex breakdown by Region") +
       xlab("Region") + ylab("Percentage");

[Image: the same plot with the raw counts labelled inside the bars, females at the top and males at the bottom]
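
For anyone who would rather use a single geom_text layer, here are two possible variations on the same idea. This is only a sketch, not a definitive solution. The first stores the label height in a helper column (LabelY is a hypothetical name introduced here, not part of the original data) and maps it to y. The second uses position_fill(vjust = 0.5), which centres each label inside its own bar segment instead of pinning it to the top or bottom, so it gives a slightly different layout from the one asked for in the question.

```r
library(ggplot2)

MYDATA <- data.frame(Region = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
                     Sex    = c("M", "F", "M", "F", "M", "F", "M", "F", "M", "F"),
                     Count  = c(185, 130, 266, 201, 304, 283, 102, 60, 55, 51))

# LabelY is a hypothetical helper column (an assumption for this sketch):
# a fixed label height per sex, expressed on the 0-1 scale used by position = "fill".
MYDATA$LabelY <- ifelse(MYDATA$Sex == "F", 0.95, 0.05)

# Shared base plot, identical to the one in the question.
base_plot <- ggplot(data = MYDATA, aes(x = Region, y = Count, fill = Sex)) +
  geom_bar(position = "fill", stat = "identity") +
  scale_y_continuous(labels = scales::percent) +
  scale_fill_manual(values = c("Maroon1", "RoyalBlue1")) +
  ggtitle("Figure 1: Sex breakdown by Region") +
  xlab("Region") + ylab("Percentage")

# Variation 1: one geom_text layer, with y mapped to the fixed heights.
p_fixed <- base_plot +
  geom_text(aes(y = LabelY, label = Count))

# Variation 2: let position_fill() place each label in the middle of its segment.
p_centered <- base_plot +
  geom_text(aes(label = Count), position = position_fill(vjust = 0.5))

print(p_fixed)
print(p_centered)
```

Variation 1 should give the same label positions as the two-layer version above. Variation 2 avoids hard-coded heights, at the cost of the labels no longer lining up in a row across regions.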