0
votes

I know this gets asked a lot, but I'm having trouble making a 100% stacked bar plot in R. There are plenty of pages explaining how, but nothing I've tried works, and I suspect the data I'm importing isn't laid out correctly, so I'd like to know what I'm doing wrong there. My data looks like the table in the attached picture. I can create exactly the chart I want in Excel (the bar graph on the right of the same picture; I couldn't attach more than one image), but for various reasons I need to make it in R. Is the way the data is arranged in Excel wrong for R, and if so, how should I restructure it?

Image: data being used on the left, correct Excel graph on the right

1
Can you add some code that you tried and where things went wrong? Right now it seems like a duplicate to me, possibly of, e.g., this question. But there may be subtle differences that we'll be able to see once you have added some code. Read here for ideas on how to make your question reproducible. - aosmith

1 Answer

1
votes

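In ggplot2, at least, you need to convert your data from "wide" to "long" format. Below, I use tidyr::gather() to gather the two data columns ("running" and "jumping") into a single "fraction" column, with a new "activity" column recording which column each value came from. You can then map "activity" to the fill color. Because the two fractions in each week sum to 1, the default stacking in geom_col() already produces a 100% stacked bar.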

library(magrittr)                       # For pipe (%>%)

dat <- tibble::tibble(
  weeks = 1:15,
  running = runif(15, 0, 1),
  jumping = 1 - running
)

dat
#> # A tibble: 15 x 3
#>    weeks running jumping
#>    <int>   <dbl>   <dbl>
#>  1     1  0.675   0.325 
#>  2     2  0.727   0.273 
#>  3     3  0.430   0.570 
#>  4     4  0.324   0.676 
#>  5     5  0.809   0.191 
#>  6     6  0.260   0.740 
#>  7     7  0.433   0.567 
#>  8     8  0.872   0.128 
#>  9     9  0.0288  0.971 
#> 10    10  0.903   0.0970
#> 11    11  0.295   0.705 
#> 12    12  0.538   0.462 
#> 13    13  0.342   0.658 
#> 14    14  0.291   0.709 
#> 15    15  0.877   0.123

library(ggplot2)

dat_long <- dat %>%
  tidyr::gather(activity, fraction, running, jumping)

dat_long
#> # A tibble: 30 x 3
#>    weeks activity fraction
#>    <int> <chr>       <dbl>
#>  1     1 running    0.675 
#>  2     2 running    0.727 
#>  3     3 running    0.430 
#>  4     4 running    0.324 
#>  5     5 running    0.809 
#>  6     6 running    0.260 
#>  7     7 running    0.433 
#>  8     8 running    0.872 
#>  9     9 running    0.0288
#> 10    10 running    0.903 
#> # ... with 20 more rows
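
As a side note, newer tidyr releases (1.0.0 and later) mark gather() as superseded by pivot_longer(). A minimal sketch of the same reshaping with that function follows; the set.seed() call is my own addition so the example is repeatable, and the numbers are random placeholders, not the asker's data.

```r
library(tidyr)

# Same wide layout as the example data above; values are random placeholders.
set.seed(1)  # assumption: seed added only so this sketch is repeatable
dat <- tibble::tibble(
  weeks = 1:15,
  running = runif(15, 0, 1),
  jumping = 1 - running
)

# Stack the two activity columns into name/value pairs.
dat_long <- pivot_longer(
  dat,
  cols = c(running, jumping),
  names_to = "activity",
  values_to = "fraction"
)

dat_long
```

The result has the same 30 rows as the gather() version, but ordered week by week (running, jumping, running, ...) rather than all "running" rows first. That ordering makes no difference to the plot.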

ggplot(dat_long) +
  aes(x = factor(weeks), y = fraction, fill = activity) +
  geom_col()
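
The plot above reaches 100% only because each week's fractions already sum to 1. If your spreadsheet holds raw counts instead, one option is to let ggplot2 do the normalizing with position = "fill". The sketch below uses a small made-up data frame (counts_long and its numbers are hypothetical), already in long format.

```r
library(ggplot2)

# Hypothetical raw counts (not fractions), already in long format.
counts_long <- data.frame(
  weeks = rep(1:3, times = 2),
  activity = rep(c("running", "jumping"), each = 3),
  count = c(12, 30, 7, 8, 10, 21)
)

# position = "fill" rescales each bar so its segments total 1.
p <- ggplot(counts_long) +
  aes(x = factor(weeks), y = count, fill = activity) +
  geom_col(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "weeks", y = "share of total")

p
```

scales::percent only changes the axis labels to percentages; the scales package is installed as a dependency of ggplot2.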

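You can also do this in base R. barplot() stacks the rows of a matrix within each column, so the data needs to be a matrix with one row per activity and one column per week. (Note that I use [, -1] to drop the first column, "weeks", before transposing with t().) The values printed below differ from the tibble shown earlier, evidently because the random example data were regenerated between runs.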

dat_tmat <- t(as.matrix(dat[, -1]))
dat_tmat
#>              [,1]      [,2]      [,3]      [,4]       [,5]      [,6]
#> running 0.5227949 0.5352537 0.5879579 0.2678927 0.93068128 0.2948861
#> jumping 0.4772051 0.4647463 0.4120421 0.7321073 0.06931872 0.7051139
#>               [,7]      [,8]      [,9]       [,10]      [,11]     [,12]
#> running 0.07729363 0.8925416 0.5503279 0.007479232 0.02991765 0.5832765
#> jumping 0.92270637 0.1074584 0.4496721 0.992520768 0.97008235 0.4167235
#>             [,13]     [,14]     [,15]
#> running 0.8660134 0.1156794 0.3176998
#> jumping 0.1339866 0.8843206 0.6823002

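# Rows stack bottom-up: running (blue) below jumping (red).
barplot(dat_tmat, col = c("blue", "red"), names.arg = dat$weeks,
        xlab = "weeks", ylab = "fraction")
legend("topleft", c("running", "jumping"), fill = c("blue", "red"), bg = "white")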
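
If the matrix holds raw counts rather than fractions, base R can rescale each column before plotting. Here is a minimal sketch using prop.table(); the counts matrix and its values are made up for illustration.

```r
# Hypothetical raw counts: one row per activity, one column per week.
counts <- matrix(
  c(12, 8,
    30, 10,
    7, 21),
  nrow = 2,
  dimnames = list(c("running", "jumping"), paste("week", 1:3))
)

# Divide every column by its own total so each bar sums to 1.
shares <- prop.table(counts, margin = 2)

# legend.text = TRUE builds the legend from the matrix row names.
barplot(shares, col = c("blue", "red"), ylab = "share of total",
        legend.text = TRUE, args.legend = list(x = "topleft", bg = "white"))
```

Here margin = 2 tells prop.table() to normalize within columns, which is what makes every bar the same height.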