I have successfully made a stacked barplot in R where the percentages add up to 100% for several different categories. The dataframe looks like this:

   sujeito teste epentese vozeamento palavra tipo  ortografia
   <chr>   <chr> <chr>    <chr>      <chr>   <chr> <chr>     
 1 a       n     1        0          cats    ts    cs        
 2 b       l     1        1          ducks   ks    cs        
 3 c       l     1        1          cups    ps    cs        
 4 d       l     0        0          grapes  ps    ces       
 5 e       l     1        0          lakes   ks    ces       
 6 f       n     1        0          gates   ts    ces       
 7 g       n     0        0          books   ks    cs        
 8 h       n     1        0          cakes   ks    ces       
 9 a       n     1        1          kites   ts    ces       
10 b       n     1        0          boats   ts    cs     

Then I used ggplot2 and dplyr to make a stacked barplot displaying these percentages. I used this code:

dados %>%
  group_by(sujeito, epentese) %>%
  summarise(quantidade = n()) %>%
  mutate(frequencia = quantidade / sum(quantidade)) %>%
  ggplot(., aes(x = sujeito, y = frequencia, fill = epentese)) +
  geom_col() +
  geom_col(position = position_fill(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Epenthesis rates by subject") +
  theme(plot.title = element_text(hjust = 0.5)) +
  xlab("Subject") + ylab("Frequency")

My goal, though, is to make it look like the graph on the right-hand side of this picture:

[image: target plot, not reproduced here]

I have tried different packages and also tried adjusting geom_text, but still no luck, mainly because I don't need labels for both fill categories, just the red one. I hope this isn't too redundant. Thanks in advance!

1 Answer

To label only the red bars, you could use, for example, an if_else() inside geom_text() so that the other fill category gets an empty label.

Additionally, I removed the redundant geom_col() and used some random example data.

library(ggplot2)
library(dplyr)

set.seed(42)

dados <- data.frame(
  sujeito = sample(letters[1:8], 100, replace = TRUE),
  epentese = sample(0:1, 100, replace = TRUE)
)

dados %>%
  # count rows per subject and epenthesis value, then convert to within-subject shares
  group_by(sujeito, epentese) %>%
  summarise(quantidade = n()) %>%
  mutate(frequencia = quantidade / sum(quantidade)) %>%
  ggplot(aes(x = sujeito, y = frequencia, fill = factor(epentese))) +
  # reverse = TRUE puts the first level (0, the red one) at the bottom of the stack
  geom_col(position = position_stack(reverse = TRUE)) +
  # label only epentese == 0; the other category gets an empty string
  geom_text(
    aes(label = if_else(epentese == 0, scales::percent(frequencia, accuracy = 1), "")),
    vjust = 0, nudge_y = .01
  ) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Epenthesis rates by subject") +
  theme(plot.title = element_text(hjust = 0.5)) +
  xlab("Subject") + ylab("Frequency")
#> `summarise()` regrouping output by 'sujeito' (override with `.groups` argument)
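
As an alternative to the if_else, you could also hand geom_text only the rows you want labelled. What follows is just a sketch on the same made-up data, not a tested solution for your real dataframe. The name `resumo` is something I introduced here for the summarised table, and I swapped group_by() + summarise() for count(), which should give the same counts without the regrouping message. The red bar is assumed to be epentese == 0, as above.

```r
library(ggplot2)
library(dplyr)

set.seed(42)

# Same kind of random example data as above (not the real data)
dados <- data.frame(
  sujeito = sample(letters[1:8], 100, replace = TRUE),
  epentese = sample(0:1, 100, replace = TRUE)
)

# Share of each epentese value within each subject
# (`resumo` is a hypothetical name chosen for this sketch)
resumo <- dados %>%
  count(sujeito, epentese, name = "quantidade") %>%
  group_by(sujeito) %>%
  mutate(frequencia = quantidade / sum(quantidade)) %>%
  ungroup()

p <- ggplot(resumo, aes(x = sujeito, y = frequencia, fill = factor(epentese))) +
  geom_col(position = position_stack(reverse = TRUE)) +
  # Only the epentese == 0 rows reach geom_text, so only those bars get a label
  geom_text(
    data = filter(resumo, epentese == 0),
    aes(label = scales::percent(frequencia, accuracy = 1)),
    vjust = 0, nudge_y = .01
  ) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Epenthesis rates by subject", x = "Subject", y = "Frequency") +
  theme(plot.title = element_text(hjust = 0.5))

p
```

Keeping the summary in its own object also makes it easy to check the numbers before plotting, e.g. that the shares add up to 1 within each subject.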