I'm trying to sort a dataframe by countries and legislative elections - in one step that is replicable for multiple different political party families.
What I did so far was to sort the main dataset into party family (parfam == '10'), "recent" elections (date > '201000'), and excluding countries with no relevant data (! country %in% nodata, nodata being a list of values I'd already created):
eco <- filter(CMPdataset, parfam == '10' & date > '201000' & ! country %in% nodata)
Due to some countries having multiple elections coded into the overarching dataset CMPdataset in the time-period after 2010, I went through the data manually and eliminated all the unnecessary ones by hand using:
eco <- eco[-c(1,8,10,11,13,14,18,20,21,22,23,27,28,31,32,34,35,37), ]
As you can see, this gets tedious for larger dataframes. So I thought I'd combine the functions I know and came up with the following (edate is a variable holding the specific election date in the format YYYY-MM-DD, and included_elections is a list I made of all the specific elections I want to include):
eco2 <- filter(CMPdataset, parfam == '10' & ! country %in% nodata & edate %in% included_elections)
However, this yields no results, and I have no clue why! I could just stick to doing it all by hand, but it's quite tedious and not easily replicable, which is why I'd really prefer a solution like this. Any help would be greatly appreciated!
What do dput(head(CMPdataset$edate)) and dput(head(included_elections)) show? The dates might be encoded differently. - Frank

> dput(head(CMPdataset$edate))
structure(c(-9237, -9237, -9237, -9237, -9237, -7774), class = "Date")
> dput(head(included_elections2))
c("2014-09-14", "2013-09-09", "2011-09-15", "2011-04-17", "2013-04-27", "2010-06-13")
- luca_s

Convert included_elections to date format, included_elections <- as.Date(included_elections). But @iod's approach is a better long-term solution. - Frank
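For anyone landing here later, this is a minimal sketch of the conversion suggested in the comments. The toy CMPdataset, its values, and the contents of nodata are invented for illustration. Only the column names and the as.Date() call come from the thread. It uses base R's subset() so that it runs without extra packages, and the same condition should work inside dplyr's filter().

```r
# Toy stand-in for CMPdataset; every value here is made up for illustration.
CMPdataset <- data.frame(
  country = c(11, 11, 12, 13),
  parfam  = c("10", "10", "10", "20"),
  edate   = as.Date(c("2010-06-13", "2014-09-14", "2011-09-15", "2013-09-09"))
)
nodata <- c(99)  # hypothetical list of countries to exclude

# Election dates typed in as text, as in the question.
included_elections <- c("2014-09-14", "2011-09-15", "2013-09-09")

# Check the types on both sides of %in% before filtering.
class(CMPdataset$edate)    # "Date"
class(included_elections)  # "character"

# Convert the lookup vector so both sides are Date objects.
included_elections <- as.Date(included_elections)

# Same condition as in the question, now comparing Date with Date.
eco2 <- subset(
  CMPdataset,
  parfam == "10" & !(country %in% nodata) & edate %in% included_elections
)
print(eco2)
```

Checking class() on both vectors first is a cheap way to spot this kind of mismatch. Converting the lookup vector once, rather than the dataset column, leaves the original data untouched.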