2
votes

I'm about to analyze some data, but I'm stuck on the visualization and can't make any progress right now.

So, here are dummy dataframes which are similar to the ones I use:

df1<-data.frame(replicate(15,sample(0:200,1500,rep=TRUE)))
df2<-data.frame(replicate(15,sample(0:200,1500,rep=TRUE)))
df3<-data.frame(replicate(36,sample(0:200,1500,rep=TRUE)))
df4<-data.frame(replicate(9,sample(0:200,1500,rep=TRUE)))

And the problem is the following:

I want to plot boxplots of each dataframe as a whole next to each other, so that the boxplots for df1, df2, df3 and df4 are side by side in one plot. I don't want one box per station (column), but one box for each dataframe as a whole.

Boxplots for each dataframe on its own work fine:

boxplot(df1, las=2)
boxplot(df2, las=2)
boxplot(df3, las=2)
boxplot(df4, las=2)

I then tried to combine them with ggplot:

ggplot(data = NULL, aes(x, y))+
  geom_boxplot(data = df1, aes())+
  geom_boxplot(data = df2, aes())+
  geom_boxplot(data = df3, aes())+
  geom_boxplot(data = df4, aes())

But here I get an error message

Error in FUN(X[[i]], ...) : object 'x' not found

saying that something is wrong with the aes(), which is obvious, but I have no idea what to choose for x and y. Maybe I'm overthinking this, but there's some link I'm missing.

I hope everything is understandable. If any information is missing, just ask and I'll add it!

1
Can you elaborate on what your desired result is? It seems that your plot returns multiple boxplots per data.frame, so how do you want to combine them? – yrx1702
Also, your dataframes contain neither x nor y... – coffeinjunky

1 Answer

3
votes

Maybe this is what you are looking for. To replicate the base R boxplots via ggplot2 you could

  1. Put your dataframes in a list.
  2. Convert the dataframes to long format, for which I use lapply and a helper function which
    • converts the dataframe to long format using tidyr::pivot_longer
    • uses forcats::fct_inorder to convert the column with the variable names to a factor, which preserves the column order of the original dataframe.
  3. Bind the long dataframes into one dataframe using e.g. dplyr::bind_rows, where I add an id variable.
  4. After the data wrangling it's an easy task to make boxplots via ggplot2, whereby I opted for faceting by dataframe.

library(ggplot2)
library(tidyr)
library(dplyr)

df1<-data.frame(replicate(15,sample(0:200,1500,rep=TRUE)))
df2<-data.frame(replicate(15,sample(0:200,1500,rep=TRUE)))
df3<-data.frame(replicate(36,sample(0:200,1500,rep=TRUE)))
df4<-data.frame(replicate(9,sample(0:200,1500,rep=TRUE)))

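# 1. Put the dataframes in a list
df <- list(df1, df2, df3, df4)

# 2. Helper: reshape to long format (columns `name` and `value`)
#    and keep the original column order as factor levels
to_long <- function(x) {
  pivot_longer(x, everything()) %>% 
    mutate(name = forcats::fct_inorder(name))
}
df <- lapply(df, to_long)

# 3. Bind into one dataframe; `id` records which dataframe a row came from
df <- bind_rows(df, .id = "id")

# 4. One box per column, one facet per dataframe
ggplot(df, aes(name, value)) +
  geom_boxplot() +
  facet_wrap(~id, scales = "free_x")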
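
If you want to check that the reshaping did what you expect before plotting, a quick sanity check could look like the sketch below. It is not part of the workflow above: the names `dfs` and `long` and the `set.seed()` call are my own additions for illustration. Each id should hold (number of columns) x (number of rows) values of the corresponding dataframe.

```r
library(tidyr)
library(dplyr)

# Dummy data with the same shapes as in the question
set.seed(1)  # assumption: only added so the example is reproducible
dfs <- list(
  data.frame(replicate(15, sample(0:200, 1500, replace = TRUE))),
  data.frame(replicate(15, sample(0:200, 1500, replace = TRUE))),
  data.frame(replicate(36, sample(0:200, 1500, replace = TRUE))),
  data.frame(replicate(9,  sample(0:200, 1500, replace = TRUE)))
)

# Reshape every dataframe to long format and stack them;
# for an unnamed list, `id` is the position in the list ("1", "2", ...)
long <- bind_rows(
  lapply(dfs, function(x) pivot_longer(x, everything())),
  .id = "id"
)

# Number of values per dataframe
count(long, id)
```

With the dimensions used here, the counts should add up to 1500 rows times 75 columns in total.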

EDIT: To get a single boxplot per dataframe that pools all of its columns, with the boxplots side by side, map id to the x axis:

ggplot(df, aes(id, value)) +
  geom_boxplot()
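
If you would rather stay in base R for this last plot, the following is a minimal sketch of an alternative, not the approach used above. It assumes it is fine to pool all columns of a dataframe into one vector; the names `dfs` and `pooled` and the `set.seed()` call are mine.

```r
# Dummy data with the same shapes as in the question
set.seed(1)  # assumption: only added so the example is reproducible
dfs <- list(
  df1 = data.frame(replicate(15, sample(0:200, 1500, replace = TRUE))),
  df2 = data.frame(replicate(15, sample(0:200, 1500, replace = TRUE))),
  df3 = data.frame(replicate(36, sample(0:200, 1500, replace = TRUE))),
  df4 = data.frame(replicate(9,  sample(0:200, 1500, replace = TRUE)))
)

# unlist() flattens all columns of a dataframe into one vector
pooled <- lapply(dfs, unlist)

# boxplot() on a list draws one box per element, labelled by the list names
boxplot(pooled, las = 2)
```

Naming the list elements also helps in the ggplot2 version: when bind_rows gets a named list, it uses those names for the id column, so the x axis shows df1 to df4 instead of 1 to 4.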