I have a data frame that can be generated as follows:

DD <- data.frame(
  group = c(rep("A", 5), rep("B", 6)),
  Feat1 = rnorm(11),
  feat2 = rnorm(11, 3, 5),
  feat3 = rnorm(11),
  feat4 = rnorm(11, 2, 3)
)

I would like to plot the distribution of each feature column, split by the two levels (A and B) of the group column. That is, I would like four plots, one each for Feat1, feat2, feat3 and feat4, where each plot shows the distributions for group A and group B. I would like all four plots in a single frame.

Any idea how I can do this using ggplot?

Please post what you've tried so far, and where you're getting stuck. That will make it easier to help you learn. It's better to think of SO as a co-learning site, rather than a place to come to have others solve your problems from scratch. - andrew_reece

1 Answer

I'm not 100% sure what you're trying to achieve, but I think that pivoting your data should get you on the right track. If we move all the feature values into a single column it's much easier to split up the plot into facets.

library(ggplot2)
library(tidyr)

DD2 <- DD %>% 
  pivot_longer(-group, names_to = "feature")

#   group feature  value
#   <fct> <chr>    <dbl>
# 1 A     Feat1    2.17 
# 2 A     feat2   -2.69 
# 3 A     feat3    3.07 
# 4 A     feat4    0.848
# 5 A     Feat1   -2.00 
# 6 A     feat2   -4.96 
# 7 A     feat3    0.798
# 8 A     feat4   -2.96 
# 9 A     Feat1   -1.65 
#10 A     feat2    3.45
# ... with 34 more rows

We can now easily facet the plot by the feature column:

DD2 %>%
  ggplot(aes(x = group, y = value)) +
  geom_boxplot() + # Also works with geom_violin(), geom_jitter() etc.
  facet_grid(~feature)
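
If you want actual distribution curves rather than boxplots, and a 2 x 2 layout instead of a single row, something along these lines might be closer to what the question describes. This is only a sketch: the set.seed() call, the alpha value and the free scales are my own choices and not part of the original data or question.

```r
library(ggplot2)
library(tidyr)

set.seed(1)  # assumption: fixed seed only so the example is reproducible

# Same data frame as in the question
DD <- data.frame(
  group = c(rep("A", 5), rep("B", 6)),
  Feat1 = rnorm(11),
  feat2 = rnorm(11, 3, 5),
  feat3 = rnorm(11),
  feat4 = rnorm(11, 2, 3)
)

# Long format: one row per (group, feature, value)
DD2 <- pivot_longer(DD, -group, names_to = "feature")

# One panel per feature, with overlaid densities for groups A and B
p <- ggplot(DD2, aes(x = value, fill = group)) +
  geom_density(alpha = 0.5) +                       # semi-transparent so both groups stay visible
  facet_wrap(~feature, ncol = 2, scales = "free")   # 2 x 2 grid, each panel with its own axes
p
```

facet_wrap() with ncol = 2 gives the four panels in one frame, and scales = "free" lets each feature use its own axis range, which helps here because the features are drawn with different means and standard deviations. With only 5 or 6 points per group the density estimates will be rough, so treat them as a visual aid.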
