I have a data frame made of 420 rows and 37 columns from insect field sampling data. I am trying to sum column values every 5 rows so that every 5 rows becomes just 1. In addition I am trying to make the non numerical values in my dataframe($site, $date, $plot) "collapse" so they fit the single new row made of the sums I mentioned.
I have tried using advice from this thread:
Summing columns on every nth row of a data frame in R
Including using package dplyr's summarize_each function as well as gl and colSums that are part of data.table.
I have tried these on my whole dataframe
library(data.table) setDT(FinalData)[, as.list(colSums(.SD)), by = gl(ceiling(420/5), 5, 420)]
this gives me:
Error in colSums(.SD) : 'x' must be numeric
and
library(dplyr)
FinalData %>%
group_by(indx = gl(ceiling(420/5), 5, 420)) %>%
summarise_each(funs(sum))
which gives me error:
Error in Summary.factor(c(4L, 4L, 4L, 4L, 4L), na.rm = FALSE) :
‘sum’ not meaningful for factors
Site.Date.Plot CarA CarB CarC...
1.SL.VI.1 0 0 1
2.SL.VI.1 0 0 0
3.SL.VI.1 0 6 0
4.SL.VI.1 0 0 3
5.SL.VI.1 1 0 0
...
every 5 rows has a different $Site.Date.Plot. I expect this:
Site.Date.Plot CarA CarB CarC...
1. SL.VI.1 1 6 4
2. SL.VI.2 ... ... ...
But I get the above error messages from above.
dput
so people can copy and paste. Also, including 20ish rows may have been better just to verify that every 5 rows there is a different Site.Date.Plot. So,dput(FinalData[1:20, 1:4])
would have been great. – Cole