I am trying to learn dplyr, and I cannot find an answer to a relatively simple question on Stack Overflow or in the documentation, so I thought I'd ask it here.
I have a data.frame that looks like this:
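set.seed(1)
# 10 random values, crossed with factor_1 (1 to 5) and factor_2 (a/b)
dat <- data.frame(rnorm(10, 20, 20), rep(seq(5), 2), rep(c("a", "b"), 5))
names(dat) <- c("number", "factor_1", "factor_2")
dat <- dat[order(dat$factor_1, dat$factor_2), ]
# drop two rows so that some factor_1 levels lack an "a" value
dat <- dat[c(-3, -7), ]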
      number factor_1 factor_2
1   7.470924        1        a
6   3.590632        1        b
2  23.672866        2        b
3   3.287428        3        a
8  34.766494        3        b
4  51.905616        4        b
5  26.590155        5        a
10 13.892232        5        b
I would like to use dplyr to subtract the value in the number column where factor_2=="b" from the value where factor_2=="a", within each level of factor_1.
The first line of the resulting data.frame would look like:
      diff factor_1
1 3.880291        1
A caveat is that not every level of factor_1 has a value for each level of factor_2. When a level of factor_2 is missing, I would like to use 0 as its number. For example, factor_1==2 has only a "b" row, so its diff would be 0 - 23.672866.
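To pin down the target, here is a minimal base R sketch of the full result I would expect. It is not the dplyr approach I am asking for. It assumes at most one row per factor_1/factor_2 combination and relies on xtabs() filling empty combinations with 0. The names `wide` and `expected` are only for illustration.

```r
# Rebuild the example data from above
set.seed(1)
dat <- data.frame(number   = rnorm(10, 20, 20),
                  factor_1 = rep(seq(5), 2),
                  factor_2 = rep(c("a", "b"), 5))
dat <- dat[order(dat$factor_1, dat$factor_2), ]
dat <- dat[c(-3, -7), ]

# Sum `number` in each factor_1 x factor_2 cell; cells with no rows become 0
wide <- xtabs(number ~ factor_1 + factor_2, data = dat)

# a minus b within each level of factor_1
expected <- data.frame(diff     = as.vector(wide[, "a"] - wide[, "b"]),
                       factor_1 = as.integer(rownames(wide)))
print(expected)
```

Levels 2 and 4 of factor_1 have no "a" row, so their diff is simply the negated "b" value.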
Thank you for your help.