0
votes

I'm struggling to draw a boxplot from a data frame. My data frame df has three columns: df$A, df$B, and df$C. I want df$C on the x-axis and df$A and df$B plotted as boxplots on the y-axis in a single graph, so that each x value has two boxes, one for df$A and one for df$B, aligned side by side. I tried using interaction to combine A and B into one column and then plotting it like this:

df$AandB <- interaction(df$A, df$B)

ggplot(aes(y = AandB, x = df$C), data = df) + geom_boxplot()

But it didn't work: it showed only horizontal lines. Sorry, I couldn't upload the image because I'm a new user.

I found some suggestions to use fill or colour, but they didn't work either.

Any suggestions?

A sample of my df:

  A         B         C  
 200.12    30.11       28.75 
 100.75    26.17       29.98        
 27.33      25.58      34.98 
 25.19      22.6       35.56 
 40.03      21.02      37.51 
 20.3       18.31      44.75   
Please provide some additional information on your data. What is the output of str(df)? Could you provide the output of dput(df)? - Sven Hohenstein
I have included the output of df! - SimpleNEasy
In your example data, each unique value of C has exactly one value for A and one value for B. What kind of boxes are you looking for? - Sven Hohenstein
I want to plot, for each unique value of C, a box for A and a box for B aligned next to each other to show the difference. The boxes should include the mean. - SimpleNEasy

1 Answer

1
votes

The data:

df <- read.table(text="A         B         C  
200.12    30.11       28.75 
100.75    26.17       29.98        
27.33      25.58      34.98 
25.19      22.6       35.56 
40.03      21.02      37.51 
20.3       18.31      44.75", header = TRUE)

First, the data need to be rearranged into long format. The melt function from the reshape2 package combines the values of A and B into a single column.

library(reshape2)
df_l <- melt(df, id.vars = "C")
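
To make the long format concrete, here is a minimal base-R sketch that builds the same three columns (C, variable, value) by hand. It is only an illustration of what melt is doing for this data set, not how reshape2 is implemented; the name df_long is an assumption chosen for this example.

```r
# Same sample data as in the question
df <- read.table(text = "A B C
200.12 30.11 28.75
100.75 26.17 29.98
27.33 25.58 34.98
25.19 22.6 35.56
40.03 21.02 37.51
20.3 18.31 44.75", header = TRUE)

# Stack A and B into a single 'value' column and record the source
# column in 'variable'; C is repeated once per stacked column.
# (df_long is a hypothetical name used only in this sketch.)
df_long <- data.frame(
  C        = rep(df$C, times = 2),
  variable = factor(rep(c("A", "B"), each = nrow(df))),
  value    = c(df$A, df$B)
)

str(df_long)  # 12 rows: the 6 A values followed by the 6 B values
```

Each value of C now appears twice, once with its A value and once with its B value, which is the shape ggplot2 needs to summarise both per x position.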

Now, the plot can be created:

library(ggplot2)
ggplot(df_l, aes(x = C, y = value)) + 
  stat_summary(aes(group = C),
               # ggplot2 >= 3.3.0 uses fun/fun.min/fun.max (formerly fun.y/fun.ymin/fun.ymax)
               fun = mean, fun.min = min, fun.max = max, geom = "crossbar")

At each value of C, the crossbar spans the range from the smaller to the larger of A and B, and its middle line marks their mean.
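
To check the numbers behind each crossbar, the same three summaries (minimum, mean, maximum per value of C) can be computed in base R. This is a rough sketch under the assumption that the sample data above is the whole data set; the names values_by_C and crossbar_stats are made up for this example.

```r
# Same sample data as in the question
df <- read.table(text = "A B C
200.12 30.11 28.75
100.75 26.17 29.98
27.33 25.58 34.98
25.19 22.6 35.56
40.03 21.02 37.51
20.3 18.31 44.75", header = TRUE)

# Group the stacked A and B values by their C value
# (values_by_C and crossbar_stats are hypothetical names)
values_by_C <- split(c(df$A, df$B), rep(df$C, times = 2))

# One row per C value: lower edge, middle line, upper edge of the crossbar
crossbar_stats <- t(sapply(values_by_C, function(v) {
  c(ymin = min(v), y = mean(v), ymax = max(v))
}))

print(crossbar_stats)
```

Because every value of C has exactly one A and one B, ymin and ymax are simply the smaller and larger of the two, and y is their midpoint. A true boxplot of a single observation has no spread to show, so it collapses to a horizontal line.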