I have a large dataset (> 3k rows) that I want to filter based on geographic location and date. The location filtering works fine but I get the following error message when using logical operators on Dates with filter (dplyr):
Error: level sets of factors are different
My current code is below:
head(master.data)
State.Name County.Code Latitude Longitude Arithmetic.Mean Date.Local
1 Alabama 3 30.49748 -87.88026 8.0 2014-01-02
2 Alabama 3 30.49748 -87.88026 7.0 2014-01-05
3 Alabama 3 30.49748 -87.88026 7.0 2014-01-08
4 Alabama 3 30.49748 -87.88026 3.6 2014-01-11
5 Alabama 3 30.49748 -87.88026 5.2 2014-01-14
6 Alabama 3 30.49748 -87.88026 4.4 2014-01-17
master.data$Date.Local <- as.Date(master.data$Date.Local, format = "%Y-%m-%d")
site.info <- data.frame("Alabama", 3, 30, 90, "28/12/2015", "13/07/2016")
names(site.info) <- c("State.Name", "County.Code", "Latitude", "Longitude",
"Date.Start", "Date.End")
site.info$Date.Start <- as.Date(site.info$Date.Start, format = "%d/%m/%Y")
site.info$Date.End <- as.Date(site.info$Date.End, format = "%d/%m/%Y")
reduced.data <- filter(master.data, State.Name == site.info$State.Name,
Date.Local >= site.info$Date.Start
& Date.Local <= site.info$Date.End)
Both site.info and master.data have the dates formatted using as.Date. The input format is different because they are imported from external sources.
Outside of filter, I am able to perform logical operations on the two with the expected results, so I am not sure why this fails. Using %in% gives the same result:
Date.Local %in% c(site.info$Date.Start, site.info$Date.End)
How can I get this to work?
State.Name == site.info$State.Name may be a problem if 'site.info' has more than one row and the column is of factor class. You may try a join. This can be done more easily with data.table, i.e. setDT(master.data)[site.info, on = .(State.Name, Date.Local >= Date.Start, Date.Local <= Date.End)] - akrun
site.info has no more than 12 rows and the dates are formatted using as.Date, so I'm curious as to why it doesn't work. - Gautam
With dput(head(master.data)) in your question we can know for certain. Unless you want this column as a factor, don't let it become one. E.g. use stringsAsFactors = FALSE in read.table - Richard Telford
State.Name is indeed a factor. The output is too long to print here. The others are int or num, except for Date.Local, which is a Date. - Gautam
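The comments point at the State.Name comparison rather than the dates, so here is a minimal sketch of that diagnosis. It assumes State.Name is a factor in both data frames with different level sets, as the last comment suggests. The toy data frames are made-up stand-ins for the real data, and converting both sides with as.character is one possible workaround, not necessarily the one the commenters had in mind.

```r
library(dplyr)

# Toy stand-ins for the data in the question (values are made up for illustration)
master.data <- data.frame(
  State.Name = factor(c("Alabama", "Alabama", "Alaska")),
  Date.Local = as.Date(c("2016-01-02", "2014-01-05", "2016-03-01"))
)
site.info <- data.frame(
  State.Name = factor("Alabama"),
  Date.Start = as.Date("28/12/2015", format = "%d/%m/%Y"),
  Date.End   = as.Date("13/07/2016", format = "%d/%m/%Y")
)

# Comparing two factors whose level sets differ raises an error,
# regardless of what the date columns contain
res <- try(master.data$State.Name == site.info$State.Name, silent = TRUE)
factor.compare.failed <- inherits(res, "try-error")

# Comparing as character avoids the level-set check
reduced.data <- filter(
  master.data,
  as.character(State.Name) == as.character(site.info$State.Name),
  Date.Local >= site.info$Date.Start,
  Date.Local <= site.info$Date.End
)
```

This sketch uses a single-row site.info. With several rows, the element-wise comparison would recycle and a join, as suggested in the first comment, is likely the safer route. Importing the data with stringsAsFactors = FALSE, as the third comment advises, avoids the factor comparison in the first place.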