5
votes

I have a data frame like the one below and want the output shown under "Desired Output" at the end. Instead, I get the NA output shown in the middle. Is there a way to do this with dplyr?

library(dplyr)

x <- c(1234, 1234, 1234, 5678, 5678)
y <- c(95138, 30004, 90038, 01294, 15914)
z <- c('2014-01-20', '2014-10-30', '2015-04-12', '2010-2-28', '2015-01-01')
df <- data.frame(x, y, z)
df$z <- as.Date(df$z)
df %>% group_by(x) %>% summarise(y = y[max(z)])

What I get:
     x  y
1 1234 NA
2 5678 NA

Desired Output:
     x     y 
1 1234 90038
2 5678 15914

2 Answers

7
votes

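You can use which.max, which returns the position of the maximum of z within each group; that position can then be used to subset y. max(z) returns the latest date itself rather than its position, so y[max(z)] indexes y by the date's underlying day count, which is far beyond the length of y and therefore gives NA.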

df %>%
    group_by(x) %>%
    summarise(y= y[which.max(z)])
#     x     y
#1 1234 90038
#2 5678 15914
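
To make the difference concrete, here is a minimal base R sketch using only the first group's values from the question. The variable names `pos` and `bad` are introduced here for illustration, and `as.numeric()` is applied explicitly to show what the index amounts to when a Date is used as a subscript.

```r
# Values for group x == 1234, taken from the question
z <- as.Date(c("2014-01-20", "2014-10-30", "2015-04-12"))
y <- c(95138, 30004, 90038)

# max() returns the latest date itself, not where it sits in the vector
latest <- max(z)

# Used as an index, the date behaves like its day count since 1970-01-01,
# which is far larger than length(y), so the lookup yields NA
bad <- y[as.numeric(latest)]

# which.max() returns the position of the latest date, a valid index into y
pos <- which.max(z)
good <- y[pos]

print(bad)
print(good)
```

Note that which.max returns only the first position when several dates tie for the maximum, so this approach yields exactly one value per group.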
4
votes

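Use filter with max in dplyr. This keeps every row whose z equals the latest date in its group, so all columns (including z) are returned, and rows that tie for the latest date are all kept.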

df %>% group_by(x) %>% filter(z == max(z))
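
As a rough sketch of how the filter approach can be trimmed to the two columns in the desired output, the snippet below rebuilds the question's data and adds ungroup() and select(). It also shows slice_max() as an alternative, on the assumption that dplyr 1.0.0 or later is installed. The names `res` and `res2` are made up for this example.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(
  x = c(1234, 1234, 1234, 5678, 5678),
  y = c(95138, 30004, 90038, 1294, 15914),
  z = as.Date(c("2014-01-20", "2014-10-30", "2015-04-12",
                "2010-02-28", "2015-01-01"))
)

# Keep the rows with the latest date per group, then drop z
res <- df %>%
  group_by(x) %>%
  filter(z == max(z)) %>%
  ungroup() %>%
  select(x, y)

# Alternative (assumes dplyr >= 1.0.0): one row per group even when dates tie
res2 <- df %>%
  group_by(x) %>%
  slice_max(z, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(x, y)

print(res)
print(res2)
```

With this data no dates tie within a group, so both versions return one row per x. They differ only when the latest date appears more than once: filter keeps all such rows, while slice_max with with_ties = FALSE keeps a single one.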