1 vote

I am having trouble creating plots in R. If I have data like this:

[Image: screenshot of the data]

I want to create:

[Image: desired plot]

with the x-axis being Sepal.Length, Sepal.Width, Petal.Width, and Petal.Length, the y-axis being the different species, and the height being the values. I also want to fill each bar plot with a different color according to the species on the y-axis.

Thank you!

So far, I have tried:

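library(ggplot2)
library(ggridges)  # provides geom_density_ridges2(), used in the second plot
library(reshape2)

# Mean of each measurement per species, reshaped to long format
iris_mean <- aggregate(iris[, 1:4], by = list(Species = iris$Species), FUN = mean)
df_mean <- melt(iris_mean, id.vars = c("Species"), variable.name = "Samples",
  value.name = "Values")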

ggplot(df_mean, aes(Samples, Values)) +
  geom_bar(aes(fill = Species), stat = "identity") +
  facet_grid(Species ~ ., scales = "free", space = "free") +
  theme(panel.spacing = unit(0.1, "lines"))  # panel.margin is deprecated


ggplot(df_mean,aes(x=Samples,y=Species,height =Values))+
  geom_density_ridges2(aes(fill=Species),stat='identity',
                       scale=1.5,
                       alpha=0.1,
                       lty = 1.1)
What code have you tried so far? You should post it, otherwise it'll be hard to get an answer. - RLave
I have tried: ggplot(df,aes(Samples,Values))+ geom_bar(aes(fill=Species),stat="identity")+ facet_grid(Species~.,scale='free',space='free')+theme(panel.margin = unit(0.1, "lines")) and ggplot(df_mean,aes(x=Samples,y=Species,height =Values))+ geom_density_ridges2(aes(fill=Species),stat='identity', scale=1.5, alpha=0.1, lty = 1.1) - Cecily Mag
You see how in your ridge plot you just have a single line for each? That's because the ridge plot is meant for displaying a distribution of values, like a histogram does, but you're only giving it a single mean for each. Try just using the original values, not the averages. - camille

2 Answers

3 votes

Your faceted plot is on the right track. Like I said in my comment, what you're trying to display is a distribution of values, not the means of values. You could set breaks manually and calculate counts to show in a geom_bar, but that would easily get very complicated, especially since the different types of measures are on different scales. I'd recommend just sticking with a simple histogram. I used gather rather than melt to make long data; that's just preference.

Beyond what you've got, it's a matter of 1. working with distributions, and 2. being clever with the theme. If you move the facet labels, rotate the left-side strips, take out the strip background, and remove vertical spacing between panels, you've essentially got a ridge plot. I'm not very familiar with ggridges, but I'd guess it does something similar. From here, you can adjust how you see fit.

library(tidyverse)

iris_long <- as_tibble(iris) %>%
  gather(key = measure, value = value, -Species)

ggplot(iris_long, aes(x = value, fill = Species)) +
  # geom_density_ridges() +
  geom_histogram(show.legend = FALSE) +
  scale_y_continuous(breaks = NULL) +
  labs(x = "Measure", y = "Species") +
  facet_grid(Species ~ measure, scales = "free", switch = "both") +
  theme(strip.background = element_blank(), strip.text.y = element_text(angle = 180), 
        strip.placement = "outside", panel.spacing.y = unit(0, "cm"))
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Created on 2018-07-19 by the reprex package (v0.2.0).
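
For comparison, here is a rough sketch of what the ggridges route might look like when it is fed the raw observations instead of the means. This is not part of the reprex above; it assumes the ggridges package is installed, and the names iris_long, value, and measure are just illustrative. Base R's stack() does the reshaping so that nothing beyond ggplot2 and ggridges is needed.

```r
library(ggplot2)
library(ggridges)

# Long format built from the raw observations (not the means):
# stack() puts the four measurement columns on top of each other,
# so Species has to be repeated once per measurement column.
iris_long <- data.frame(
  Species = rep(iris$Species, times = 4),
  stack(iris[, 1:4])
)
names(iris_long) <- c("Species", "value", "measure")

# One density ridge per species, one panel per measure.
# The x scale is freed because the measures cover different ranges.
p <- ggplot(iris_long, aes(x = value, y = Species, fill = Species)) +
  geom_density_ridges(alpha = 0.6, show.legend = FALSE) +
  facet_grid(. ~ measure, scales = "free_x") +
  labs(x = "Measure", y = "Species")

p
```

Because each ridge is a density estimate over 50 observations, the ridges have an actual shape, unlike the flat lines you get when each species contributes a single mean.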

1 vote

FYI, it's better to post your data than a screenshot, and you should also post the code you've tried so far.

What you're looking for is facet_grid:

library(tidyverse)

iris_summarized <- iris %>%
 group_by(Species, Sepal.Length) %>%
 summarize(total = n())

ggplot(iris_summarized, aes(x = Sepal.Length, y = total, fill = Species)) + # the fill argument sets the color for the bars
 geom_col() + # use geom_col instead of geom_bar if you are explicitly referencing counts in your data set
 facet_grid(Species ~ ., switch = "y") + # the switch = "y" argument moves the species name to the left side of the graph
 theme(strip.placement = "outside", # this moves the title of each facet to the left of the axis
       strip.background = element_blank()) # this makes the title of each facet not have a background
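
The example above only covers Sepal.Length. As a hedged sketch of how the same counting idea could be extended to all four measures, the version below reshapes the data to long format first and then facets by both species and measure. It uses base R for the data preparation (an assumption made here to keep dependencies down; dplyr would work just as well), and the names iris_long, iris_counts, and n are illustrative.

```r
library(ggplot2)

# Raw observations in long format: one row per (Species, measure, value)
iris_long <- data.frame(
  Species = rep(iris$Species, times = 4),
  stack(iris[, 1:4])
)
names(iris_long) <- c("Species", "value", "measure")

# Count how often each distinct value occurs per species and measure
iris_counts <- aggregate(
  n ~ Species + measure + value,
  data = transform(iris_long, n = 1L),
  FUN = sum
)

# One row of panels per species, one column per measure
p <- ggplot(iris_counts, aes(x = value, y = n, fill = Species)) +
  geom_col(show.legend = FALSE) +
  facet_grid(Species ~ measure, scales = "free_x", switch = "y") +
  theme(strip.placement = "outside",
        strip.background = element_blank())

p
```

Freeing only the x scale keeps the bar heights comparable across panels while letting each measure use its own value range.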