I have a dataframe df with three columns: TASK, CONDITION, and SCORE. I want to represent the data:

  1. as barplots (I'm using geom_col)
  2. with a separate plot for each TASK (I'm using facet_wrap(~TASK))
  3. with a separate bar for each CONDITION (I'm using ggplot(df, aes(x=CONDITION)))

Additionally, bars whose values sum to the same percentage should have the same color. Unfortunately, I can't get that to work.

In the minimal example below, three bars reach 100%, so I expect them all to be blue, as per high="blue", but that is not what happens.

Input =("
TASK CONDITION SCORE
GAU   0         0.25
GAU   0         0.25
GAU   0         0.25
GAU   0         0.25
GAU   1         0.2
GAU   1         0.2
GAU   1         0.2
GAU   1         0.2
GAU   1         0.2
PLN   0         0.3333
PLN   0         0.3333
PLN   0         0
PLN   1         0.5
PLN   1         0.5
        ")
df <- read.table(textConnection(Input),
                 header=TRUE)
df$CONDITION <- factor(df$CONDITION)

library(ggplot2)
ggplot(df, aes(x=CONDITION, y=SCORE, fill=SCORE)) +
  geom_col() +
  ggtitle("Performance") +
  ylab("Total") +
  scale_y_continuous(labels = scales::percent) +
  facet_wrap(~TASK) +
  scale_fill_gradient(low="red", high="blue")

1 Answer

What's really going on is a bit hidden by the plot. If we put a border on the bars and change the first value, it becomes clearer:

df2 <- df
df2[1, "SCORE"] <- .5
ggplot(df2, aes(x=CONDITION, y=SCORE, fill=SCORE)) +
  geom_col(color="black") +
  ggtitle("Performance") +
  ylab("Total") +
  scale_y_continuous(labels = scales::percent) +
  facet_wrap(~TASK) +
  scale_fill_gradient(low="red", high="blue")

The fill is mapped to each observation's SCORE, not to the total height of the bar. geom_col stacks one rectangle per row, and each rectangle is colored by its own value; that is why your color scale only goes up to 0.5. If you want to stay within ggplot for this, you can use a summary stat with geom_bar to do the summation for you. It would look like this:

# ggplot2 >= 3.3.0; older versions used fill=..y.. and fun.y="sum"
ggplot(df, aes(x=CONDITION, y=SCORE, fill=after_stat(y))) +
  geom_bar(stat="summary", fun="sum") +
  ggtitle("Performance") +
  ylab("Total") +
  scale_y_continuous(labels = scales::percent) +
  facet_wrap(~TASK) +
  scale_fill_gradient(low="red", high="blue")

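As an alternative to the summary stat, you could do the summation yourself before plotting. The following is only a sketch: it rebuilds the question's data with data.frame() so that it runs on its own, and the names totals and p are made up for illustration. It uses base R's aggregate() to get one row per TASK/CONDITION pair, so geom_col draws a single rectangle per bar and the fill reflects the bar total.

```r
library(ggplot2)

# Rebuild the question's data so the sketch is self-contained.
df <- data.frame(
  TASK      = rep(c("GAU", "PLN"), times = c(9, 5)),
  CONDITION = factor(c(rep(0, 4), rep(1, 5), rep(0, 3), rep(1, 2))),
  SCORE     = c(rep(0.25, 4), rep(0.2, 5), 0.3333, 0.3333, 0, 0.5, 0.5)
)

# Per-row values only reach 0.5, which is why the original legend stopped there.
range(df$SCORE)

# One row per TASK/CONDITION; SCORE now holds the bar total.
totals <- aggregate(SCORE ~ TASK + CONDITION, data = df, FUN = sum)

# Each bar is a single rectangle, so fill=SCORE colors by the total.
p <- ggplot(totals, aes(x = CONDITION, y = SCORE, fill = SCORE)) +
  geom_col() +
  ggtitle("Performance") +
  ylab("Total") +
  scale_y_continuous(labels = scales::percent) +
  facet_wrap(~TASK) +
  scale_fill_gradient(low = "red", high = "blue")
p
```

With the totals computed up front, the three bars that reach 100% share the same fill value, and the aggregated data frame can be inspected directly if a bar gets an unexpected color.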