I am currently trying to apply the summarise function in order to isolate the relevant observations from a large data set. A simple reproducible example is given here:
df <- data.frame(c(1,1,1,2,2,2,3,3,3), as.logical(c(TRUE,FALSE,TRUE,TRUE,TRUE,TRUE,FALSE,TRUE,FALSE)),
as.numeric(c(0,5,0,0,0,0,7,0,7)))
colnames(df) <- c("ID", "Status", "Price")
ID Status Price
1 1 TRUE 0
2 1 FALSE 5
3 1 TRUE 0
4 2 TRUE 0
5 2 TRUE 0
6 2 TRUE 0
7 3 FALSE 7
8 3 TRUE 0
9 3 FALSE 7
I would like to sort the table by observation and get the status TRUE only if all three observations are TRUE (figured out) and then want to get the price corresponding to the status (i.e. 5 for observation 1 as FALSE, 0 for observation 2 as TRUE and 7 for observation 3 as FALSE).
From Summarize with conditions in dplyr I have figured out that I can - just as usually - specify the conditions in square brackets. My code so far thus looks like this:
library(dplyr)
result <- df %>%
group_by(ID) %>%
summarize(Status = all(Status), Test = ifelse(all(Status) == TRUE,
first(Price[Status == TRUE]), first(Price[Status == FALSE])))
# This is what I get:
# A tibble: 3 x 3
ID Status Test
<dbl> <lgl> <dbl>
1 1. FALSE 0.
2 2. TRUE 0.
3 3. FALSE 7.
But as you can see, for ID = 1 it gives an incorrect price. I have been trying this forever, so I would appreciate any hint as to where I have been going wrong.