1
votes

I have a dataset with several conditions, and I want to create a stacked bar graph showing the frequency of errors occurring in each condition (i.e., the number of cases in each condition where 1 error occurred, 2 errors occurred, 3 errors occurred, and so on).

In theory, I understand the principle of creating bar graphs with ggplot2. However, the problem I am having is that the 'frequency' count is not an actual variable in the data frame (it requires counting the number of cases). I'm not sure how to add it to the ggplot2 framework (potentially using the 'stat' argument, but I'm not certain how this works).

I checked out the following similar questions:

How to barplot frequencies with ggplot2?

R stacked % frequency histogram with percentage of aggregated data based on

Display frequency instead of count with geom_bar() in ggplot

How to label stacked histogram in ggplot

But none of them really provides the answer I'm looking for (i.e., how to count the number of cases for each 'error' and include that in the ggplot2 code).

Below are some of my attempts with example data:

library(tidyverse)
library(reshape2) # melt() comes from reshape2, not the tidyverse

condition <- c("condition 1", "condition 2", "condition 3", "condition 1", "condition 2", "condition 3", "condition 1", "condition 2", "condition 3", "condition 1", "condition 2", "condition 3", "condition 1", "condition 2", "condition 3")
number_of_errors <- c(1,2,3,3,2,1,4,4,5,4,5,1,2,2,3)

df <- data.frame(condition, number_of_errors)
df

df_melt <- melt(df) # This creates a data frame with 3 columns ('condition', 'variable' and 'value'), where 'variable' just says 'number_of_errors' for each row


# Attempt 1 - (Error: stat_bin() can only have an x or y aesthetic.)
ggplot(df_melt, aes(x=condition, y = variable, fill=value)) + 
  geom_bar(stat="bin", position="stack") +
  xlab("Condition") + 
  ylab("Frequency of Errors")


# Attempt 2 (produces a graph, but not a stacked one, just the total number of cases in each condition)
ggplot(df_melt, aes(x = condition, fill = value, label = value)) +
  geom_bar(col="black") +
  stat_count(position="stack")


# Attempt 3 (also produces a graph, but again not a stacked one - I think it is the sum of the number of errors?)
ggplot(df_melt,aes(factor(condition),y=as.numeric(value))) + 
  geom_bar(stat = "identity", position = "stack")

I am certain I must be missing something obvious about how to create values for the counts, but I'm not sure what. Any guidance is appreciated :)

2
I'm not quite clear on what the desired output is supposed to look like. What are the bars meant to be labeled and what should the colors represent? Maybe you just want ggplot(df_melt, aes(x = condition, fill = factor(value))) + geom_bar(position = "stack") + labs(fill="Number of Errors")? - MrFlick
@MrFlick This also works, and as far as I can tell it generates exactly the same output as Chuck P's solution :) To clarify for anyone else reading, I was looking for the colours to represent the number of errors and the bar heights to represent the count of cases in each error-number category. - becbot

2 Answers

2
votes

Maybe you are looking for this style of plot. You need to group by condition and then assign an index to each row within its group so that each bar segment gets its own fill. Here is the code:

library(tidyverse)
#Data
condition <- c("condition 1", "condition 2", "condition 3", "condition 1", "condition 2", "condition 3", "condition 1", "condition 2", "condition 3", "condition 1", "condition 2", "condition 3", "condition 1", "condition 2", "condition 3")
number_of_errors <- c(1,2,3,3,2,1,4,4,5,4,5,1,2,2,3)
df <- data.frame(condition, number_of_errors)
#Code
df %>% group_by(condition) %>% mutate(Number=factor(1:n())) %>%
  ggplot(aes(x=condition,y=number_of_errors,fill=Number,group=Number))+
  geom_bar(stat = 'identity')+
  geom_text(aes(label=number_of_errors),position = position_stack(0.5))+
  theme(legend.position = 'none')


1
votes

I think the key for you might be to convert number_of_errors to a factor and use geom_bar(stat="count"). You may also benefit from this tutorial.

library(ggplot2)
df$number_of_errors <- factor(df$number_of_errors)

ggplot(df, aes(x=condition, fill = number_of_errors)) +
  geom_bar(stat="count")
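
If it helps to see the counts as an actual column, here is a minimal sketch that computes them first with dplyr's count() and then plots them with geom_col(). It should give the same kind of stacked bars as stat="count", just with the counting step made explicit. The name df_counts and the axis/legend labels are my own choices for illustration, not something from the question.

```r
library(dplyr)
library(ggplot2)

# Same example data as in the question
df <- data.frame(
  condition = rep(c("condition 1", "condition 2", "condition 3"), times = 5),
  number_of_errors = c(1, 2, 3, 3, 2, 1, 4, 4, 5, 4, 5, 1, 2, 2, 3)
)

# One row per (condition, number_of_errors) pair; count() stores the tally in column `n`
df_counts <- df %>%
  count(condition, number_of_errors)

# geom_col() uses the precomputed counts as bar heights and stacks them by default
p <- ggplot(df_counts, aes(x = condition, y = n, fill = factor(number_of_errors))) +
  geom_col() +
  labs(x = "Condition", y = "Number of cases", fill = "Number of errors")
p
```

Printing df_counts before plotting is a quick way to check that the frequencies are what you expect.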