
I have a data frame like



I want to make a separate column for each value of loc3, with rows defined by loc1, loc2, tr1, tr2, tr3, Birth, and Species. I want to 'count' the statuses of all the observations that share these values and group the counts by loc3.

I planned to use dcast from the reshape2 package.

I wrote a function to perform the 'count' I want. I'm new to R and while I'm sure there is a function that does this, I couldn't find it immediately and it seemed a worthwhile exercise to try and write the script myself, given the simplicity of the task.

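  for (i in 1:length(x))
    if (is.na(x[i])) {
      # NA: count unchanged
    } else if (x[i] == 0) {
      count <- count + 1  # 0: increment (variable name assumed)
    } else if (x[i] == 1) {
      # 1: count unchanged
    }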

0s should increase the count and 1s and NAs shouldn't.



I get the error

Error in if (is.na(x[i])) { : argument is of length zero

Which makes me think I do not understand how dcast is treating fun.aggregate...

Thanks for the help! -JJE


1 Answer


Why not something like this, using the tabulate function? Note that with a 0/1 Status column, tabulate counts the 1s in each cell, not the 0s:

dcast(df, ... ~ loc3, value.var = "Status", fun.aggregate = tabulate)

##         date loc1 loc2 tr1 tr2 tr3 Birth Species 1 2
## 1  1/27/2010    9    E   0   0   1 early       A 0 0
## 2  1/27/2010    9    E   0   0   1 early       B 0 0
## 3  1/27/2010    9    N   0   0   1 early       B 0 0
## 4  1/27/2010    9    N   0   0   1  late       A 0 0
## 5  1/27/2010    9    W   0   0   1 early       B 0 0
## 6  1/27/2010    9    W   0   0   1  late       A 0 0
## 7  1/27/2010   10    E   0   1   2  late       A 0 0
## 8  1/27/2010   10    E   0   1   2  late       B 0 2
## 9  1/27/2010   10    N   0   0   1  late       A 0 0
## 10 1/27/2010   10    N   0   1   2  late       B 0 2
## 11 1/27/2010   10    W   0   1   2  late       A 0 0
## 12 1/27/2010   10    W   0   1   2  late       B 0 0
## 13 1/27/2010   11    E   0   1   2  late       A 0 0
## 14 1/27/2010   11    E   1   0   3 early       B 0 2
## 15 1/27/2010   11    N   0   1   2 early       B 0 0
## 16 1/27/2010   11    N   0   1   2  late       A 0 0
## 17 1/27/2010   11    W   1   0   3  late       A 0 0
## 18 1/27/2010   11    W   1   0   3  late       B 0 2
## 19 1/27/2010   12    E   1   0   3 early       B 0 0
## 20 1/27/2010   12    E   1   0   3  late       A 0 0
## 21 1/27/2010   12    N   1   0   3 early       A 2 0
## 22 1/27/2010   12    N   1   0   3 early       B 0 2
## 23 1/27/2010   12    W   1   0   4 early       A 0 0
## 24 1/27/2010   12    W   1   1   4 early       B 0 0
## 25 1/27/2010   13    E   1   1   4 early       B 0 0
## 26 1/27/2010   13    E   1   1   4  late       A 0 0
## 27 1/27/2010   13    N   1   1   4  late       A 0 0
## 28 1/27/2010   13    N   1   1   4  late       B 0 2
## 29 1/27/2010   13    W   1   1   4 early       A 0 0
## 30 1/27/2010   13    W   1   1   4 early       B 0 2
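
To see why the table above contains 0s and 2s, here is a small base R sketch of what tabulate returns for a 0/1 vector. The vectors are made-up stand-ins for the Status values of a single cell, not data from the question.

```r
# Made-up Status vectors for one cell (illustrative only).
with_ones <- tabulate(c(0, 1, 1, NA))  # two 1s; the 0 and the NA are ignored
only_zeros <- tabulate(c(0, 0))        # no 1s present
empty_cell <- tabulate(integer(0))     # zero-length input

print(with_ones)   # 2
print(only_zeros)  # 0
print(empty_cell)  # 0
```

In each case the result has length one, which is what fun.aggregate needs, but it is the number of 1s rather than the number of 0s.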


If you want to count the number of 0s instead, for example:

dcast(df, ... ~ loc3, value.var = "Status", 
         fun.aggregate = function(x) sum(x == 0, na.rm = TRUE))
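
As for the error in the question, a likely cause (as far as I can tell) is that dcast also evaluates fun.aggregate on a zero-length vector to work out a default fill value, and a loop over 1:length(x) breaks on empty input. The sketch below reproduces that in base R. The function names and the loop body are assumptions reconstructed from the description in the question, not the original code.

```r
# Hypothetical reconstruction of the loop-based counter (names assumed).
count_zeros_loop <- function(x) {
  count <- 0
  for (i in 1:length(x)) {   # when x is empty, 1:0 is c(1, 0), not an empty range
    if (is.na(x[i])) {
      # NA: leave count unchanged
    } else if (x[i] == 0) {
      count <- count + 1
    }
  }
  count
}

# Vectorised version, as used in the dcast call above.
count_zeros_sum <- function(x) sum(x == 0, na.rm = TRUE)

status <- c(0, 1, NA, 0)          # made-up values
loop_count <- count_zeros_loop(status)
sum_count <- count_zeros_sum(status)

empty <- numeric(0)
sum_empty <- count_zeros_sum(empty)
# Capture the error message instead of stopping the script.
loop_empty <- tryCatch(count_zeros_loop(empty),
                       error = function(e) conditionMessage(e))

print(loop_count)  # 2
print(sum_count)   # 2
print(sum_empty)   # 0
print(loop_empty)  # "argument is of length zero"
```

On the empty vector the loop reaches i = 0, where x[0] is itself zero-length, so if() has nothing to test. Iterating with seq_along(x) instead of 1:length(x) avoids this, and the sum() version needs no loop at all.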