0
votes

I'm struggling with the filter (dplyr) function on a tidy dataframe:

data1<-data.frame("Time"=c(0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5),
                  "Variable"=rep(c("a","b","c","d"),6),
                  "Value"=c(0,1,0,0,1,1,1,1,1,3,2,3,10,1,3,7,2,1,4,2,3,1,5,13))

What I want to do is keep the rows for the Time at which variable "a" is equal to 2, and separately the rows for the Time at which variable "a" is at its maximum. For the first case my code is:

library(dplyr)
data1<-data1%>%
  group_by(Time)%>%
  filter(any(Variable=="a" & Value==2))

This works fine and gives me:

Time Variable Value
4    a        2
4    b        1
4    c        4
4    d        2

I don't know how to do the same for a=max(a). I tried:

data1<-data1%>%
  group_by(Time)%>%
  filter(any(Variable=="a" & Value==max(Value)))

but it doesn't work, because max is calculated over every variable within each Time group, not just over "a". I think I need something like Value=max(Value)[Variable$a]. The filtered result should look like this:

Time Variable Value
3    a        10 
3    b        1
3    c        3
3    d        7

I'd prefer a dplyr solution. Can anyone give me a general rule for filtering a tidy data frame on multiple criteria?

3 Answers

1
votes

Here's a dplyr way:

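library(dplyr)
# Also match on Variable == "a", so another variable that happens to have
# the same Value is not picked up; %in% handles ties in the maximum
data1%>%
  filter(Time %in% Time[Variable == "a" & Value == max(Value[Variable == "a"])])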

And a data.table way:

library(data.table)
setDT(data1)
# Same condition as above: restrict the match to rows where Variable == "a"
data1[Time %in% Time[Variable == "a" & Value == max(Value[Variable == "a"])]]
1
votes

An additional option:

data1 %>% 
  filter(Variable == "a") %>% 
  filter(Value == max(Value, na.rm = TRUE)) %>% 
  select(Time) %>% 
  left_join(., data1, by = "Time")
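A possible variant of the same idea, as a sketch rather than a definitive version: semi_join() keeps the rows of data1 that have a matching Time in the key table without adding any columns, so the select() step is not needed. It assumes the data1 from the question (rebuilt here so the snippet runs on its own); `a_max_rows` is just an illustrative name.

```r
library(dplyr)

# The data frame from the question, rebuilt so the snippet is self-contained
data1 <- data.frame(
  Time     = rep(0:5, each = 4),
  Variable = rep(c("a", "b", "c", "d"), 6),
  Value    = c(0,1,0,0,1,1,1,1,1,3,2,3,10,1,3,7,2,1,4,2,3,1,5,13)
)

# Rows where "a" reaches its overall maximum (illustrative name)
a_max_rows <- data1 %>%
  filter(Variable == "a") %>%
  filter(Value == max(Value, na.rm = TRUE))

# Keep every row of data1 whose Time appears in a_max_rows
result <- semi_join(data1, a_max_rows, by = "Time")
result
```

Because semi_join() only filters, the columns of `result` are exactly those of data1, in the original row order.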
0
votes

Based on the edited criteria, this should provide the desired results:

data1 <- data1 %>%
         group_by(Time) %>%
         filter(any(Variable=="a" & 
                    Value==max(data1$Value[data1$Variable == 'a'])))
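As for a general rule, one way to think about it (a sketch under assumptions, not the only approach): turn each criterion into the set of Time keys that satisfy it, computed only on the rows of the variable of interest, then keep all rows whose Time is in that set. The helper `times_where()` below is hypothetical, introduced only for illustration, and the data frame is the one from the question rebuilt so the snippet runs on its own.

```r
library(dplyr)

# The data frame from the question, rebuilt so the snippet is self-contained
data1 <- data.frame(
  Time     = rep(0:5, each = 4),
  Variable = rep(c("a", "b", "c", "d"), 6),
  Value    = c(0,1,0,0,1,1,1,1,1,3,2,3,10,1,3,7,2,1,4,2,3,1,5,13)
)

# Hypothetical helper: the Times at which `variable` satisfies `pred`.
# `pred` only ever sees the values of that one variable, so max() inside it
# is the maximum of that variable, not of the whole Value column.
times_where <- function(df, variable, pred) {
  sub <- df[df$Variable == variable, ]
  sub$Time[pred(sub$Value)]
}

# Criterion 1: "a" equals 2
a_is_2 <- data1 %>%
  filter(Time %in% times_where(data1, "a", function(v) v == 2))

# Criterion 2: "a" is at its maximum
a_is_max <- data1 %>%
  filter(Time %in% times_where(data1, "a", function(v) v == max(v)))
```

Several criteria can be combined by intersecting (or taking the union of) the key sets before the final filter().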