I am looking for a solution using data.table. I have a data.table with the following columns:
library(data.table)
data <- data.table(GROUP = c(3, 3, 4, 4, 5, 6),
                   YEAR = c(1979, 1985, 1999, 2011, 2012, 1994),
                   NAME = c("Smith", "Anderson", "James", "Liam", "George", "Adams"))
The resulting data.table:
GROUP YEAR NAME
3 1979 Smith
3 1985 Anderson
4 1999 James
4 2011 Liam
5 2012 George
6 1994 Adams
For each group we want to select one row using the following rule:
- If there is a year > 2000, select the row with the minimum year above 2000.
- If there is no year > 2000, select the row with the maximum year.
Desired output:
GROUP YEAR NAME
3 1985 Anderson
4 2011 Liam
5 2012 George
6 1994 Adams
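For reference, here is a minimal sketch of how the rule above might be written as a grouped data.table operation. It assumes YEAR is numeric, that ties are resolved by taking the first matching row, and that the result should be stored in a variable I am calling `result` (a name chosen only for illustration). It is one possible approach and has been checked only against the sample data.

```r
library(data.table)

# Sample data from the question
data <- data.table(
  GROUP = c(3, 3, 4, 4, 5, 6),
  YEAR  = c(1979, 1985, 1999, 2011, 2012, 1994),
  NAME  = c("Smith", "Anderson", "James", "Liam", "George", "Adams")
)

# For each group: if any year is above 2000, keep the row with the smallest
# such year; otherwise keep the row with the largest year.
result <- data[, {
  after <- which(YEAR > 2000)             # row positions with YEAR > 2000
  idx <- if (length(after) > 0L) {
    after[which.min(YEAR[after])]         # earliest year above 2000
  } else {
    which.max(YEAR)                       # no year above 2000: latest year
  }
  .SD[idx]                                # the single selected row
}, by = GROUP]

print(result)
```

The branch is evaluated once per group inside `j`, so each group returns exactly one row, and `by = GROUP` keeps the groups in their order of first appearance.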
Thanks! I have been struggling with this for a while.