I know this gets asked a lot, but I'm having trouble making a 100% stacked bar plot in R. There are plenty of pages explaining how, but nothing is working, and I suspect the data I'm importing isn't laid out correctly, so what I really want to know is what I'm doing wrong there. My data looks like the data in the attached picture. I can create exactly the chart I want in Excel, which is also in the picture (the bar graph on the right; I couldn't attach more than one picture, so both are in the same image), but for various reasons I need it in R. Is the way the data is laid out in Excel incorrect, and if so, how do I fix it?
Can you add some code that you tried and where things went wrong? Right now it seems like a duplicate to me, possibly of, e.g., this question. But there may be subtle differences that we'll be able to see once you have added some code. Read here for ideas on how to make your question reproducible.
- aosmith
1 Answer
In ggplot2 at least, you need to convert your data from "wide" to "long" format. Below, I use the tidyr::gather function to "gather" the two data columns ("running" and "jumping") into a single "fraction" column, which you can then color by "activity".
library(magrittr) # For pipe (%>%)
dat <- tibble::tibble(
  weeks = 1:15,
  running = runif(15, 0, 1),
  jumping = 1 - running
)
dat
#> # A tibble: 15 x 3
#> weeks running jumping
#> <int> <dbl> <dbl>
#> 1 1 0.675 0.325
#> 2 2 0.727 0.273
#> 3 3 0.430 0.570
#> 4 4 0.324 0.676
#> 5 5 0.809 0.191
#> 6 6 0.260 0.740
#> 7 7 0.433 0.567
#> 8 8 0.872 0.128
#> 9 9 0.0288 0.971
#> 10 10 0.903 0.0970
#> 11 11 0.295 0.705
#> 12 12 0.538 0.462
#> 13 13 0.342 0.658
#> 14 14 0.291 0.709
#> 15 15 0.877 0.123
library(ggplot2)
dat_long <- dat %>%
  tidyr::gather(activity, fraction, running, jumping)
dat_long
#> # A tibble: 30 x 3
#> weeks activity fraction
#> <int> <chr> <dbl>
#> 1 1 running 0.675
#> 2 2 running 0.727
#> 3 3 running 0.430
#> 4 4 running 0.324
#> 5 5 running 0.809
#> 6 6 running 0.260
#> 7 7 running 0.433
#> 8 8 running 0.872
#> 9 9 running 0.0288
#> 10 10 running 0.903
#> # ... with 20 more rows
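As a side note, tidyr::gather is marked as superseded in current tidyr releases, and tidyr::pivot_longer is the suggested replacement. A minimal sketch of the same reshaping step follows; the fixed seed is my own addition so the example is repeatable, and the rows come out ordered by week rather than by activity, which makes no difference to the plot.

```r
library(tidyr)

set.seed(1)  # assumption: fixed seed, only so the sketch is repeatable

# Same "wide" layout as above: one row per week, one column per activity
dat <- tibble::tibble(
  weeks = 1:15,
  running = runif(15, 0, 1),
  jumping = 1 - running
)

# Reshape to "long": one row per (week, activity) pair
dat_long <- pivot_longer(
  dat,
  cols = c(running, jumping),  # columns to stack
  names_to = "activity",       # old column names go here
  values_to = "fraction"       # old cell values go here
)

dat_long
```

The result has the same three columns (weeks, activity, fraction), so the ggplot call works on it unchanged.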
ggplot(dat_long) +
  aes(x = factor(weeks), y = fraction, fill = activity) +
  geom_col()
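The bars above reach 100% only because running and jumping already sum to 1 in every week. If your spreadsheet holds raw amounts instead, one option is to let ggplot2 do the rescaling with position = "fill". The sketch below uses made-up numbers (the minutes column and its values are hypothetical, not taken from the question):

```r
library(ggplot2)

# Hypothetical raw amounts in long format (not the asker's data)
counts_long <- data.frame(
  weeks    = rep(1:5, times = 2),
  activity = rep(c("running", "jumping"), each = 5),
  minutes  = c(30, 45, 20, 60, 10, 15, 15, 40, 20, 50)
)

p <- ggplot(counts_long) +
  aes(x = factor(weeks), y = minutes, fill = activity) +
  geom_col(position = "fill") +                            # rescale every bar to a total of 1
  scale_y_continuous(labels = scales::label_percent()) +   # label the axis as percentages
  labs(x = "Week", y = "Share of time")

print(p)
```

With position = "fill" each bar is divided by its own total, so you don't need to compute the fractions yourself before plotting.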
You can also do this in base R by converting to a "wide" matrix. (Note that I also use [, -1] to drop the first column, "weeks".)
dat_tmat <- t(as.matrix(dat[, -1]))
dat_tmat
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> running 0.5227949 0.5352537 0.5879579 0.2678927 0.93068128 0.2948861
#> jumping 0.4772051 0.4647463 0.4120421 0.7321073 0.06931872 0.7051139
#> [,7] [,8] [,9] [,10] [,11] [,12]
#> running 0.07729363 0.8925416 0.5503279 0.007479232 0.02991765 0.5832765
#> jumping 0.92270637 0.1074584 0.4496721 0.992520768 0.97008235 0.4167235
#> [,13] [,14] [,15]
#> running 0.8660134 0.1156794 0.3176998
#> jumping 0.1339866 0.8843206 0.6823002
barplot(dat_tmat, col = c("blue", "red"))
legend("topleft", c("running", "jumping"), col = c("blue", "red"), lwd = 5, bg = "white")
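The base R version has the same caveat: barplot stacks whatever numbers it is given, so the columns only reach 100% if each one already sums to 1. For raw amounts, a sketch along these lines should work, using prop.table to normalize each column first. The counts matrix here is hypothetical, not the asker's data:

```r
# Hypothetical raw amounts: one row per activity, one column per week
counts <- rbind(
  running = c(30, 45, 20, 60, 10),
  jumping = c(15, 15, 40, 20, 50)
)
colnames(counts) <- 1:5

# Divide every column by its column sum so each bar totals 1
shares <- prop.table(counts, margin = 2)

barplot(
  shares,
  col = c("blue", "red"),
  xlab = "Week",
  ylab = "Share of time",
  legend.text = rownames(shares),              # legend built from the row names
  args.legend = list(x = "topright", bg = "white")
)
```

Passing legend.text to barplot keeps the legend colors in sync with the bars, so a separate legend() call isn't needed.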