I'm working with American Community Survey (ACS) 1-year estimates for a specific location over several years. For example, I'm trying to plot how the proportion of men and women riding a bike to work changes over time. From the ACS, I get estimates and standard errors, which I can then use to calculate the upper and lower bounds of the estimates.
So the simplified data structure in wide format is like this:
| Year | EstimateM | MaxM | MinM | EstimateF | MaxF | MinF |
|------|-----------|------|------|-----------|------|------|
| 2005 | 3.0 | 3.5 | 2.5 | 2.0 | 2.3 | 1.7 |
| 2006 | 3.1 | 3.5 | 2.6 | 2.0 | 2.3 | 1.7 |
| 2007 | 5.0 | 5.8 | 4.2 | 2.5 | 3.0 | 2.0 |
| ... | ... | ... | ... | ... | ... | ... |
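
For reference, Max/Min columns like the ones in this table can be derived from an estimate and its standard error. This is only a minimal sketch: the `SE` values are made up, the data frame name `acs` is hypothetical, and the multiplier 1.645 assumes a 90% confidence level, which may not be the level you want.

```r
# Hypothetical estimates and standard errors (not real ACS values)
acs <- data.frame(
  Year     = c(2005, 2006),
  Estimate = c(3.0, 3.1),
  SE       = c(0.3, 0.27)
)

# Assumption: 90% confidence level, so z is roughly 1.645
z <- 1.645

# Upper and lower bounds as estimate +/- z * standard error
acs$Max <- acs$Estimate + z * acs$SE
acs$Min <- acs$Estimate - z * acs$SE

print(acs)
```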
If I only wanted to plot the estimates, I'd `melt` the data with only the two Estimate variables as `measure.vars`:
library(reshape2)

# Keep only the two estimate columns as measured variables
GenderModeCombined_long <- melt(GenderModeCombined,
                                id.vars = "Year",
                                measure.vars = c("EstimateM", "EstimateF"))
The long data can then be easily plotted with ggplot2:
library(ggplot2)

# The column is named "Year" (capitalized), so aes() must match it
ggplot(data = GenderModeCombined_long,
       aes(x = Year, y = value, colour = variable)) +
  geom_point() +
  geom_line()
This produces a graph like so (sorry, I don't have enough rep to post images).
Where I'm stuck is how to add error bars to the two estimate graphs. I could add them as `measure.vars` to the melted dataset, but then how do I tell ggplot what should be plotted as values and what as error bars? Do I have to create a separate data frame with just the min/max data and then load that separately?
geom_errorbar(data = errordataMmax, aes(ymax = ??, ymin = ??))
I have the feeling that I'm somehow approaching this the wrong way and/or have my data set up the wrong way.
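
To make the data-layout question concrete, here is a rough sketch of one alternative setup: one row per year and gender, with the estimate and both bounds kept as separate columns, so that `geom_errorbar()` can map `ymin` and `ymax` directly. The names `long`, `Gender`, `Estimate`, `Max` and `Min` are hypothetical, the data contains only the first two rows of the table, and the `width` value is arbitrary. I'm not sure this is the idiomatic approach.

```r
library(ggplot2)

# Small wide-format sample (first two rows of the table only)
GenderModeCombined <- data.frame(
  Year      = c(2005, 2006),
  EstimateM = c(3.0, 3.1), MaxM = c(3.5, 3.5), MinM = c(2.5, 2.6),
  EstimateF = c(2.0, 2.0), MaxF = c(2.3, 2.3), MinF = c(1.7, 1.7)
)

# Stack the male and female columns so each row is one Year x Gender pair
d <- GenderModeCombined
long <- rbind(
  data.frame(Year = d$Year, Gender = "M",
             Estimate = d$EstimateM, Max = d$MaxM, Min = d$MinM),
  data.frame(Year = d$Year, Gender = "F",
             Estimate = d$EstimateF, Max = d$MaxF, Min = d$MinF)
)

# Points and lines use Estimate; error bars use the Min/Max columns
p <- ggplot(long, aes(x = Year, y = Estimate, colour = Gender)) +
  geom_point() +
  geom_line() +
  geom_errorbar(aes(ymin = Min, ymax = Max), width = 0.2)

print(p)
```

With this layout no second data frame is needed, because the error-bar layer inherits `x` and `colour` from the main `aes()` call.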