1
votes

I have a simple data.frame that looks like this:

Group     Person  Score_1   Score_2   Score_3
1         1       90        80        79
1         2       74        83        28
1         3       74        94        89
2         1       33         9         8
2         2       94        32        78
2         3       50        90        87

I first need to find the mean of Score_1, collapsing across persons within each group (i.e., the Score_1 mean for Group 1, the Score_1 mean for Group 2, etc.), and then I need to collapse across both groups to find the overall mean of Score_1. How can I calculate these values and store them as individual objects? I have used the summarise function in dplyr with the following code:

summarise(group_by(data, Group), mean(Score_1, na.rm = TRUE))

I would like to ultimately create a 6th column that gives the mean, repeated across persons for each group, and then a 7th column that gives the grand mean across all groups.

I'm sure there are other ways to do this, and I am open to suggestions (although I would still like to know how to do it in dplyr). Thanks!

3
You need mutate instead of summarise. - akrun

3 Answers

1
votes

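To add a column rather than collapse rows, use mutate instead of summarise. First compute the grand mean (MeanScore1) on the ungrouped data, then group by 'Group' and compute the per-group mean (MeanScorebyGroup), and finally reorder the columns with select so that the group mean is the 6th column and the grand mean is the 7th: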

library(dplyr)
df1 %>% 
    mutate(MeanScore1 = mean(Score_1)) %>%
    group_by(Group) %>% 
    mutate(MeanScorebyGroup = mean(Score_1)) %>%
    select(1:5, 7, 6)

The same can be done simply in base R:

df1$MeanScorebyGroup <- with(df1, ave(Score_1, Group))
df1$MeanScore1 <- mean(df1$Score_1)
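
The question also asks how to store the means as separate objects. A minimal sketch, assuming the data frame is named df1 as above; the object names group_means and grand_mean are only illustrative, not something from the original post:

```r
library(dplyr)

# Example data from the question (the name df1 is assumed)
df1 <- data.frame(
  Group   = c(1, 1, 1, 2, 2, 2),
  Person  = c(1, 2, 3, 1, 2, 3),
  Score_1 = c(90, 74, 74, 33, 94, 50),
  Score_2 = c(80, 83, 94, 9, 32, 90),
  Score_3 = c(79, 28, 89, 8, 78, 87)
)

# One row per group, holding the Score_1 mean for that group
group_means <- df1 %>%
  group_by(Group) %>%
  summarise(MeanScorebyGroup = mean(Score_1, na.rm = TRUE))

# A single number: the Score_1 mean over all rows
grand_mean <- mean(df1$Score_1, na.rm = TRUE)
```

Here summarise is the right tool, because the goal is one value per group rather than a new column of the same length as the data.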
2
votes

data.table is good for tasks like this:

library(data.table)

dt <- read.table(text = "Group     Person  Score_1   Score_2   Score_3
           1         1       90        80        79
           1         2       74        83        28
           1         3       74        94        89
           2         1       33         9         8
           2         2       94        32        78
           2         3       50        90        87", header = TRUE)

# Convert the data.frame to a data.table in place
setDT(dt)

# Mean by group
dt[, score.1.mean.by.group := mean(Score_1), by = .(Group)]
# Grand mean
dt[, score.1.mean := mean(Score_1)]
dt
0
votes

@akrun you just blew my mind!

Just to clarify what you said, here's my interpretation:

library(plyr)

Group <- c(1,1,1,2,2,2)
Person <- c(1,2,3,1,2,3)
Score_1 <- c(90,74,74,33,94,50)
Score_2 <- c(80,83,94,9,32,90)
Score_3 <- c(79,28,89,8,78,87)

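# Build the data frame directly; cbind() is not needed here
df <- data.frame(Group, Person, Score_1, Score_2, Score_3)

# Group mean of Score_1, repeated for each person in the group
df2 <- ddply(df, .(Group), mutate, meanScore = mean(Score_1, na.rm = TRUE))
# Grand mean of Score_1 across all rows
mutate(df2, meanScoreAll = mean(Score_1, na.rm = TRUE))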
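
One caveat, sketched below with hypothetical data (the scores data frame and its values are made up, not taken from the question): averaging a column of repeated group means gives the grand mean only when Score_1 has no missing values. With NAs and na.rm = TRUE, a row with a missing score still carries its group's mean, so the two results can differ.

```r
# Hypothetical scores with one missing value (not the data from the question)
scores <- data.frame(
  Group   = c(1, 1, 1, 2, 2, 2),
  Score_1 = c(90, NA, 74, 33, 94, 50)
)

# Group mean ignoring NAs, repeated on every row of the group
scores$meanScore <- ave(scores$Score_1, scores$Group,
                        FUN = function(v) mean(v, na.rm = TRUE))

# Averaging the repeated group means also counts the row whose score is NA
mean_of_repeated <- mean(scores$meanScore)

# Averaging the raw scores skips the NA row
grand_mean <- mean(scores$Score_1, na.rm = TRUE)

print(c(mean_of_repeated = mean_of_repeated, grand_mean = grand_mean))
```

In this made-up example the first value is 70.5 and the second is 68.2, so taking the mean of the raw Score_1 column is the safer way to get the grand mean when values may be missing.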