I have a data frame of dates and values. I am trying to get the fourth-highest value per year using dplyr and order(), or multiple aggregate statements. I want both the fourth-highest value and the date on which it occurred, in a data frame covering all years.
Here is my script:
timeozone <- import(i, date="DATES", date.format = "%Y-%m-%d %H", header=TRUE, na.strings="NA")
colnames(timeozone) <- c("column","date", "O3")
timeozone %>%
  mutate(month = format(date, "%m"), day = format(date, "%d"), year = format(date, "%Y")) %>%
  group_by(month, day, year) %>%
  summarise(fourth = O3[order(O3, decreasing = TRUE)[4] ])
I am not sure what is wrong with what I've got above. Any help would be appreciated.
Data:
Dates Values
11/12/2000 14
11/13/2000 16
11/14/2000 17
11/15/2000 21
11/13/2001 31
11/14/2001 21
11/15/2001 62
11/16/2001 14
See dplyr::nth(). – tchakravarty
Why are you grouping by month and day? I thought you wanted the fourth largest value per year. From the sample data you posted, there is no O3 column (is that value?) and it appears there is only one value per day -- there can be no fourth highest if that is the case. Try grouping by only year. – Mark Peterson
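Following the comments, here is a minimal sketch of grouping by year only and keeping the whole row, so the date comes along with the value. It assumes the data frame is called timeozone with a Date column named date and a numeric column named O3, as in the script above. The sample data is retyped from the question with the dates read as month/day/year, and the name fourth_highest is made up for illustration.

```r
library(dplyr)

# Sample data retyped from the question. The column names (date, O3)
# and the month/day/year date format are assumptions.
timeozone <- data.frame(
  date = as.Date(c("11/12/2000", "11/13/2000", "11/14/2000", "11/15/2000",
                   "11/13/2001", "11/14/2001", "11/15/2001", "11/16/2001"),
                 format = "%m/%d/%Y"),
  O3 = c(14, 16, 17, 21, 31, 21, 62, 14)
)

fourth_highest <- timeozone %>%
  mutate(year = format(date, "%Y")) %>%         # derive the grouping key
  group_by(year) %>%                            # one group per year, not per day
  arrange(desc(O3), .by_group = TRUE) %>%       # sort each year from highest to lowest
  slice(4) %>%                                  # keep the fourth row of each year
  ungroup()

print(fourth_highest)
```

A year with fewer than four rows simply drops out of the result, and tied values are not treated specially. If only the value is needed, summarise(fourth = nth(O3, 4, order_by = desc(O3))) after group_by(year) should give it without the date.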