Adding a column of counts for each level (or combination of levels) of a categorical variable can be handled in data.table syntax with something like:
library(data.table)
#setting up the data so it's pasteable
df <- data.table(var1 = c('dog','cat','dog','cat','dog','dog','dog'),
                 var2 = c(1,5,90,95,91,110,8),
                 var3 = c('lamp','lamp','lamp','table','table','table','table'))
#adding a count column for var1
df[, var1count := .N, by = .(var1)]
#adding a count of each combo of var1 and var3
df[, var1and3comb := .N, by = .(var1,var3)]
I am curious how I could instead produce a count column that, for each row, counts the number of records whose var2 value is within ±5 of that row's var2 value.
In my non-functioning attempt at this,
df[, var2withinrange := .N, by = .(between((var2-5),(var2+5),var2))]
I get a column holding the total number of records rather than the desired result. I'd expect the first row to hold a value of 2, since 1 and 5 both fall within ±5 of its value of 1. Row 2 should hold 3, since 1, 5, and 8 all fall within ±5 of its value of 5, and so on.
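To pin down the target output, here is a minimal brute-force sketch that produces the counts described above. It assumes the ±5 window is inclusive at both ends and simply loops over var2 with sapply, so it compares every row against every other row and is unlikely to scale to large tables. It is meant only to show the result I'm after, not as the idiomatic approach.

```r
library(data.table)

# Same sample data as above, repeated so this block runs on its own
df <- data.table(var1 = c('dog','cat','dog','cat','dog','dog','dog'),
                 var2 = c(1,5,90,95,91,110,8),
                 var3 = c('lamp','lamp','lamp','table','table','table','table'))

# For each value v in var2, count the rows whose var2 lies in [v - 5, v + 5]
# (inclusive bounds are an assumption about the intended window)
df[, var2withinrange := sapply(var2, function(v) sum(abs(var2 - v) <= 5))]

df$var2withinrange
# 2 3 3 3 3 1 2
```

The 90, 91, and 95 rows each count one another (3 apiece), 110 only counts itself, and 8 picks up 5 and itself.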
Any help on coming up with a solution is much appreciated. Ideally in data.table code!