Side-by-side bar chart with columns proportional by group (relative frequency bar chart)

Question

The dataset

library(dplyr)    # %>%, count(), filter(), mutate(), if_else()
library(ggplot2)

gender <- c('Male', 'Male', 'Male', 'Female', 'Female', 'Female', 'Male', 'Male', 'Male', 'Female', 'Female', 'Female', 'Female', 'Female', 'Male', 'Female', 'Female', 'Male', 'Female', 'Female')
answer <- c('Yes', 'No', 'Yes', 'Yes', 'No', 'No', 'No', 'No', 'No', 'No', 'No', 'Yes', 'No', 'No', 'Yes', 'Yes', 'Yes', 'Yes', 'No', 'Yes')
df <- data.frame(gender, answer)

is biased towards females (12 females versus 8 males):

df %>% ggplot(aes(gender, fill = gender)) + geom_bar()

My task is to build a graph that makes it easy to figure out which of the two genders is more likely to say 'Yes'.

But, given the bias, I cannot just do

df %>% ggplot(aes(x = answer, fill = gender)) + geom_bar(position = 'dodge')

or even

df %>% ggplot(aes(x = answer, y = after_stat(count / sum(count)), fill = gender)) +
  geom_bar(position = 'dodge')

To alleviate the bias I need to divide each of the counts by the total number of males or females respectively so that the 'Female' bars add up to 1 as well as the 'Male' ones. Like so:

df.total <- df %>% count(gender)
male.total <- (df.total %>% filter(gender == 'Male'))$n
female.total <- (df.total %>% filter(gender == 'Female'))$n

df %>% count(answer, gender) %>%
  mutate(freq = n/if_else(gender == 'Male', male.total, female.total)) %>%
  ggplot(aes(x = answer, y = freq, fill = gender)) +
  geom_bar(stat="identity", position = 'dodge')

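This paints a completely different picture: the 'Yes' bar is now taller for males than for females, although more females than males answered 'Yes' in absolute terms.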
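As a quick numeric check of that reversal, the share of 'Yes' answers within each gender can be computed directly in base R. This is only a sketch: `df` is rebuilt here from its four cell counts (4 Male/Yes, 4 Male/No, 5 Female/Yes, 7 Female/No) so that the snippet runs on its own, and `yes_share` is a name made up for the illustration.

```r
# Same data as above, rebuilt compactly from its four cell counts
df <- data.frame(
  gender = rep(c("Male", "Male", "Female", "Female"), times = c(4, 4, 5, 7)),
  answer = rep(c("Yes", "No", "Yes", "No"), times = c(4, 4, 5, 7))
)

# Mean of a logical vector is the proportion of TRUE values,
# so this gives the share of 'Yes' answers within each gender
yes_share <- tapply(df$answer == "Yes", df$gender, mean)
print(yes_share)
# Female: 5/12 (about 0.417), Male: 4/8 = 0.5
```

In raw counts females lead on 'Yes' (5 versus 4), but relative to group size males do.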

Questions:

1. Is there a way to simplify the last piece of code using only dplyr and ggplot2?
2. Are there any other libraries that can do the trick better?
3. Does the above type of chart have a conventional name?

Thanks.

Jake Kaupp · Accepted Answer · 2018-01-25T01:17:42

Question 1:

df %>%  
  count(gender, answer) %>% 
  group_by(gender) %>% 
  mutate(freq = n/sum(n)) %>% 
  ggplot(aes(x = answer, y = freq, fill = gender)) + 
  geom_bar(stat="identity", position = 'dodge')
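A possible ggplot2-only variant, offered as a sketch rather than as part of the original answer: `stat_count()` already computes a per-group proportion called `prop`, so mapping `group = gender` should yield the same within-gender frequencies without a dplyr step. It assumes ggplot2 3.3.0 or later for `after_stat()`; `df` is rebuilt from its cell counts and `p` is just an illustrative name.

```r
library(ggplot2)

# Same data as in the question, rebuilt compactly from its four cell counts
df <- data.frame(
  gender = rep(c("Male", "Male", "Female", "Female"), times = c(4, 4, 5, 7)),
  answer = rep(c("Yes", "No", "Yes", "No"), times = c(4, 4, 5, 7))
)

# group = gender makes stat_count() compute `prop` within each gender,
# so the bars belonging to one gender sum to 1
p <- ggplot(df, aes(x = answer, fill = gender, group = gender)) +
  geom_bar(aes(y = after_stat(prop)), position = "dodge") +
  labs(y = "freq")
p
```

The explicit `count()` / `group_by()` / `mutate()` pipeline is longer, but it leaves the computed frequencies in a data frame where they are easy to inspect.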

Question 2:

You can probably do it in fewer lines with other packages.
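One option that needs no add-on packages at all is base R graphics, sketched below. This is an illustration, not something the answer itself proposed: `prop.table()` with `margin = 1` turns the gender-by-answer table into row-wise proportions, and `barplot()` with `beside = TRUE` draws them side by side. `df` is rebuilt from its cell counts and `tab` is an illustrative name.

```r
# Same data as in the question, rebuilt compactly from its four cell counts
df <- data.frame(
  gender = rep(c("Male", "Male", "Female", "Female"), times = c(4, 4, 5, 7)),
  answer = rep(c("Yes", "No", "Yes", "No"), times = c(4, 4, 5, 7))
)

# Rows are genders, columns are answers; margin = 1 makes each row sum to 1
tab <- prop.table(table(df$gender, df$answer), margin = 1)

# Columns (answers) become the bar groups, rows (genders) the bars within them
barplot(tab, beside = TRUE, legend.text = TRUE, ylab = "freq")
```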

Question 3:

Relative frequency bar graph.
