This is my first post, so go easy. Up until now (the past ~5 years?) I've been able to either tweak my R code the right way or find an answer on this or various other sites. Trust me when I say that I've looked for an answer! I have a working script that creates the attached boxplot in base R: http://i.stack.imgur.com/NaATo.jpg

This is fine, but I really just want to "jazz" it up in ggplot, for vain reasons. I've looked at the following questions and they are close, but not complete: "Why does a boxplot in ggplot requires axis x and y?" and "How do you draw a boxplot without specifying x axis?"

My data is basically like "mtcars" if all the numerical variables were on the same scale. All I want to do is draw one box per variable on the same plot, like the base R boxplot I made above. The y axis is the same continuous scale (0 to 1) for every box, and the x axis simply labels each month plus a yearly average (think of all the mtcars values sharing one y axis, with each vehicle model along the x axis). Each box represents 75 observations (kind of like if mtcars had 75 different vehicle models), and again all the boxes are on the same scale. What am I missing?

1
ggplot requires data in long format. You need to convert your data to long format with, e.g., tidyr::gather or reshape2::melt. This will not demo well on mtcars since (a) mtcars doesn't have ID variables for the x axis (though we could convert the rownames to a column) and (b) it wouldn't look very nice with some discrete data and almost nothing on the same scale. But if you get your data in long format, your ggplot should be as easy as ggplot(long_data, aes(x = variable, y = value)) + geom_boxplot(). - Gregor Thomas
Basically, imagine mtcars had 75 vehicle models and 10 columns of cylinders, one column per year, covering 1986 to 1995. In base R I would just write: - chris
Sorry, in base R I would just write something like boxplot(mtcars$cyl1986, mtcars$cyl1987...) and so on. But I can't for the life of me make this simple boxplot in ggplot or qplot. I know it's because it's a more advanced package, but still. - chris

1 Answer

Though I don't think mtcars makes a great example for this, here it is:

First, we make the data (hopefully) more similar to yours by using a column instead of rownames.

mt = mtcars
mt$car = row.names(mtcars)

Then we reshape to long format:

mt_long = reshape2::melt(mt, id.vars = "car")
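
If it helps to see what the reshape produced, a quick inspection along these lines should work. This is only a sketch, assuming reshape2 is installed; it repeats the two setup lines so it can be run on its own.

```r
# Rebuild the wide data with the car names as a regular column
mt <- mtcars
mt$car <- row.names(mtcars)

# melt() stacks every non-id column into two columns: "variable" and "value"
mt_long <- reshape2::melt(mt, id.vars = "car")

# One row per car/variable pair: 32 cars x 11 variables = 352 rows
dim(mt_long)

# Column names after melting: the id column plus melt()'s defaults
names(mt_long)

# The former column names become factor levels, in their original column order
levels(mt_long$variable)
```

Each level of variable becomes one box on the x axis, which is why the aes() mapping in the plot can stay so short.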

Then the plot is easy:

library(ggplot2)
ggplot(mt_long, aes(x = variable, y = value)) +
    geom_boxplot()


Using ggplot all but requires data in "long" format rather than "wide" format. If you want something to be mapped to a graphical dimension (x-axis, y-axis, color, shape, etc.), then it should be a column in your data. Luckily, it's usually quite easy to get data in the right format with reshape2::melt or tidyr::gather. I'd recommend reading the Tidy Data paper for more on this topic.
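
For data shaped more like what the question describes (75 observations, one column per month plus a yearly average, all on a 0 to 1 scale), the same idea might look like the sketch below. The data here is simulated with runif() purely for illustration, and the names wide, long, id, month and Annual are made up for this example. It uses tidyr::pivot_longer, which superseded gather in tidyr 1.0.0, so it assumes a reasonably recent tidyr.

```r
library(ggplot2)

# Simulated stand-in for the real data: 75 rows, 12 monthly columns in [0, 1]
set.seed(1)
wide <- as.data.frame(
  matrix(runif(75 * 12), nrow = 75, dimnames = list(NULL, month.abb))
)
wide$Annual <- rowMeans(wide[month.abb])  # hypothetical yearly-average column
wide$id <- seq_len(nrow(wide))            # hypothetical ID column

# Wide to long: every column except id becomes a (month, value) pair
long <- tidyr::pivot_longer(
  wide,
  cols = -id,
  names_to = "month",
  values_to = "value"
)

# Keep the boxes in calendar order instead of alphabetical order
long$month <- factor(long$month, levels = c(month.abb, "Annual"))

# One box per month plus the yearly average, all on the same 0 to 1 scale
p <- ggplot(long, aes(x = month, y = value)) +
  geom_boxplot() +
  coord_cartesian(ylim = c(0, 1))
p
```

Setting the factor levels explicitly matters here: left as a plain character column, ggplot would sort the months alphabetically on the x axis.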