(Test for students, please don't reply ;))

Hi all,

I'm kind of new to R and I can't find a solution to my problem.
I have two columns in my dataframe: Sex and Age. I want to know the mean age of each sex.
And I want this answer to be a 2 by 2 table.

What I tried:
I can find the mean of both groups, but R adds them as a column to my dataframe.
Also, I know how to make a table with the outcome that I want, but that's of course not built from the original dataset.

What I want is a table of 2x2:
Sex AVG_age
Male 21.2
Female 21.5

Here is my code:

library(dplyr)

set.seed(13)

Sex <- sample(c("Male","Female"), 100, replace=TRUE, prob = c(0.53, 0.47))
Age <- sample(18:25, 100, replace = TRUE)

# Output with extra column
df_sex_age <- data.frame(Sex,Age) %>% 
  group_by(Sex) %>% 
  mutate(Avg_Age = mean(Age))
View(df_sex_age)

# What I want
data.frame(Sex = c("Male", "Female"),
           Avg_Age = c(21.2, 21.5))
Comment (Darren Tsai): Just replace mutate() with summarise(), i.e. ... %>% summarise(Avg_Age = mean(Age))

1 Answer

Replace mutate() with summarize() (summarise() is the same function under its other spelling). mutate() computes new values and adds them as a new column, so the result has the same number of rows as the original data frame. summarize() aggregates the data within each group, so the result has one row per unique combination of the grouping variables and contains only the grouping columns plus the new summary columns.

data.frame(Sex, Age) %>%
  group_by(Sex) %>%
  summarize(Ave_age = mean(Age))
# A tibble: 2 x 2
  Sex    Ave_age
  <chr>    <dbl>
1 Female    21.3
2 Male      21.6
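
To make the difference in shape concrete, here is a minimal sketch that runs both verbs on the question's simulated data and compares the dimensions of the results. The names df, with_mutate and with_summarise are only illustrative and do not appear in the original post.

```r
library(dplyr)

# Same simulated data as in the question
set.seed(13)
df <- data.frame(
  Sex = sample(c("Male", "Female"), 100, replace = TRUE, prob = c(0.53, 0.47)),
  Age = sample(18:25, 100, replace = TRUE)
)

# mutate() keeps every input row and repeats the group mean on each one
with_mutate <- df %>%
  group_by(Sex) %>%
  mutate(Avg_Age = mean(Age)) %>%
  ungroup()

# summarise() collapses each group to a single row
with_summarise <- df %>%
  group_by(Sex) %>%
  summarise(Avg_Age = mean(Age))

dim(with_mutate)     # one row per person: Sex, Age, Avg_Age
dim(with_summarise)  # one row per sex: Sex, Avg_Age
```

If a plain data.frame is needed rather than a tibble, wrapping the summarised result in as.data.frame() should be enough.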