I am having trouble preparing a summary table with dplyr,
based on the data set below:
set.seed(1)
df <- data.frame(sample(c(2012, 2016), 10, replace = TRUE),
                 sample(c('Treat', 'Control'), 10, replace = TRUE),
                 runif(10, 0, 1),
                 runif(10, 0, 1),
                 runif(10, 0, 1))
colnames(df) <- c('Year', 'Group', 'V1', 'V2', 'V3')
I want to calculate the mean, median and standard deviation, and count the number of observations, for each combination of Year and Group.
I have successfully used this code to get mean, median and sd:
library(dplyr)

summary.table = df %>%
  group_by(Year, Group) %>%
  summarise_all(funs(n(), sd, median, mean))
However, I do not know how to use the n() function properly inside the funs() call: it gives me a separate count for each of V1, V2 and V3. This is redundant, since I only want the sample size once per group. I have tried introducing
mutate(N = n()) %>%
before and after the group_by() line, but it did not give me what I wanted.
Any help?
EDIT: I had not made my question clear enough. The problem is that the code gives me columns that I do not need, since the number of observations for V1 alone is sufficient for me.
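For reference, one possible way to get a single count column next to the per-variable statistics is sketched below. It assumes dplyr 1.0.0 or later, where across() and the .groups argument are available; the name summary_table and the n column are illustrative choices, not part of the original code.

```r
library(dplyr)

# Same example data as in the question
set.seed(1)
df <- data.frame(
  Year  = sample(c(2012, 2016), 10, replace = TRUE),
  Group = sample(c("Treat", "Control"), 10, replace = TRUE),
  V1 = runif(10, 0, 1),
  V2 = runif(10, 0, 1),
  V3 = runif(10, 0, 1)
)

summary_table <- df %>%
  group_by(Year, Group) %>%
  summarise(
    n = n(),  # one count per Year/Group combination
    # sd, median and mean for each of V1, V2, V3;
    # result columns are named like V1_sd, V1_median, V1_mean
    across(V1:V3, list(sd = sd, median = median, mean = mean)),
    .groups = "drop"
  )

print(summary_table)
```

Because n() is computed once in summarise() rather than inside the function list, it is not repeated for every variable. Groups with a single observation will show NA for the standard deviation.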
Did you miss the () after n to make the code workable? Like this: summarise_all(funs(n(), sd, median, mean)) – raymkchow
... %>% summarise_all(funs(sd, median, mean)) %>% mutate(n = n()) – raymkchow
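A variation on the idea in the comments, sketched here under the assumption that the scoped verb summarise_all() is still preferred: compute the count before summarising and carry it along as a grouping variable. Calling n() after summarise would count rows of the already-collapsed table rather than the raw observations. The names N and res are illustrative, and a named list is used in place of funs(), which is deprecated in recent dplyr versions.

```r
library(dplyr)

# Same example data as in the question
set.seed(1)
df <- data.frame(
  Year  = sample(c(2012, 2016), 10, replace = TRUE),
  Group = sample(c("Treat", "Control"), 10, replace = TRUE),
  V1 = runif(10, 0, 1),
  V2 = runif(10, 0, 1),
  V3 = runif(10, 0, 1)
)

res <- df %>%
  group_by(Year, Group) %>%
  mutate(N = n()) %>%            # group size, computed on the raw rows
  group_by(Year, Group, N) %>%   # keep N as a key so it is not summarised
  summarise_all(list(sd = sd, median = median, mean = mean))

print(res)
```

Since N is constant within each Year/Group combination, adding it to the grouping does not create any extra groups.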