So I've got an ID and an Event, and I want to use group_by (or some similar function) to do a conditional cumulative sum. Here's the data:
ID Event
42 NA
42 1
42 2
42 NA
42 1
43 NA
43 1
43 2
43 2
and what I want to do is have two new columns that count the 1s and the 2s, cumulatively, without collapsing any of the data:
ID Event count_1s count_2s
42 NA 0 0
42 1 1 0
42 2 1 1
42 NA 1 1
42 1 2 1
43 NA 0 0
43 1 1 0
43 2 1 1
43 2 1 2
So I understand how to use group_by to sum them all up by ID, something like this:
t <- data %>% group_by(ID) %>% summarize(count_1s = sum(Event == 1, na.rm = TRUE))
But what I can't understand is how to get a running conditional sum -- seems like group_by will collapse my data and I need to maintain every row.
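For reference, here is a minimal sketch of one way the running count might be done, assuming dplyr is loaded and the data frame is named `data` as above. It swaps `summarize` for `mutate` (which keeps every row) and takes `cumsum` over a logical test; `Event %in% 1` is used instead of `Event == 1` so that NA rows count as FALSE rather than propagating NA. The name `result` is just an illustrative choice.

```r
library(dplyr)

# Sample data from the question
data <- data.frame(
  ID    = c(42, 42, 42, 42, 42, 43, 43, 43, 43),
  Event = c(NA, 1, 2, NA, 1, NA, 1, 2, 2)
)

result <- data %>%
  group_by(ID) %>%                      # counts restart for each ID
  mutate(
    count_1s = cumsum(Event %in% 1),    # %in% treats NA as FALSE
    count_2s = cumsum(Event %in% 2)
  ) %>%
  ungroup()
```

Because `mutate` adds columns instead of collapsing groups, `result` has the same nine rows as `data`.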
EDIT: so the accepted answer works perfectly, but just one more question. What if the Event values differ from one ID to the next? For example:
ID Event count_a count_b
42 NA 0 0
42 1 1 0
42 2 1 1
42 NA 1 1
42 1 2 1
43 NA 0 0
43 3 1 0
43 4 1 1
43 4 1 2
There will only ever be two distinct Event values per ID (it doesn't matter which is a and which is b), and I want the counts to reset for each ID.
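A possible sketch for this variant, again assuming dplyr. Since it doesn't matter which value is a and which is b, this version assumes a is whichever non-NA value appears first within the ID and b is the second. The names `data2`, `slot`, and `result2` are hypothetical helpers introduced only for this illustration.

```r
library(dplyr)

# Sample data from the EDIT: Event values differ between IDs
data2 <- data.frame(
  ID    = c(42, 42, 42, 42, 42, 43, 43, 43, 43),
  Event = c(NA, 1, 2, NA, 1, NA, 3, 4, 4)
)

result2 <- data2 %>%
  group_by(ID) %>%
  mutate(
    # Position of each Event among this ID's distinct non-NA values:
    # 1 for the first value seen, 2 for the second, NA for NA rows
    slot    = match(Event, unique(Event[!is.na(Event)])),
    count_a = cumsum(slot %in% 1),
    count_b = cumsum(slot %in% 2)
  ) %>%
  ungroup() %>%
  select(-slot)                         # drop the helper column
```

If more than two distinct values ever showed up for an ID, they would simply not be counted in either column under this assumption.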