0
votes

been stuck on this for a while.

For every row, I'm trying to count across the columns for all values greater than 0. But the caveat is that I need to specify the starting column to start counting across using each row's specific value in another column.

For example the table would look like this:

ID      |     StartWeek     |     1     |     2     |     3     |
123             2                 3           0           1
456             1                 1           0           1

Expected output would look like this:

ID      |     StartWeek     |     1     |     2     |     3     |   CountRow   |
123             2                 3           0           1           1
456             1                 1           0           1           2

I tried something like this:

df <- df %>%
mutate(CountRow = rowSums(.[StartWeek:5] > 0))

But it just gives me the entire column instead of the individual value of each row. I think I read upon a potential solution using groupby() but would there by another way of doing this just by accessing every rows certain value instead of calling the entire column.

2

2 Answers

1
votes

For each row we can count how many values after StartWeek are greater than 0. In dplyr we can use rowwise :

library(dplyr)

df %>%
  rowwise() %>%
  mutate(CountRow = { tmp <- c_across(`1`:`3`);
                      sum(tmp[StartWeek:length(tmp)] > 0)
                    })

#   ID    StartWeek   `1`   `2`   `3` CountRow
#  <int>     <int> <int> <int> <int>    <int>
#1   123         2     3     0     1        1
#2   456         1     1     0     1        2
1
votes

One way is to convert the dataframe to long format and filter out the unnecessary cells based on StartWeek before counting.

library(tidyverse)
df <- tribble(
  ~ID, ~StartWeek, ~"1", ~"2", ~"3",
  123L, 2L, 3L, 0L, 1L,
  456L, 1L, 1L, 0L, 1L
)
df %>% pivot_longer(cols=-c(ID, StartWeek)) %>%
  mutate(name=as.integer(name)) %>% filter(name>=StartWeek, value>0) %>%
  group_by(ID) %>% summarize(CountRow=n(), .groups="drop") %>%
  left_join(df, ., by="ID")
#> # A tibble: 2 x 6
#>      ID StartWeek   `1`   `2`   `3` CountRow
#>   <int>     <int> <int> <int> <int>    <int>
#> 1   123         2     3     0     1        1
#> 2   456         1     1     0     1        2

Created on 2021-03-11 by the reprex package (v1.0.0)