1
votes

This is my input data:

Program = c("A","A","A","B","B","C")
Age = c(10,30,30,12,32,53)
Gender = c("F","F","M","M","M","F")
Language = c("Eng","Eng","Kor","Kor","Other","Other")
df = data.frame(Program,Age,Gender,Language)

I would like to output a table like this:

Program  MEAN AGE  ENG  KOR  FEMALE  MALE
A
B
C

Here MEAN AGE is the average age per program, and ENG, KOR, FEMALE and MALE are counts per program.

I have tried using dplyr and t(), but I'm lost as to which steps to take (this is my first post and I'm new to this). Thank you in advance!


2 Answers

2
votes

You can take the following approach:

library(dplyr)

df %>%
  group_by(Program) %>%
  summarise(
    `Mean Age` = mean(Age),
    ENG = sum(Language=="Eng"),
    KOR = sum(Language=="Kor"),
    Female = sum(Gender=="F"),
    Male = sum(Gender=="M"),
    .groups="drop"
  )

Output:

# A tibble: 3 x 6
  Program `Mean Age`   ENG   KOR Female  Male
  <chr>        <dbl> <int> <int>  <int> <int>
1 A             23.3     2     1      2     1
2 B             22       0     1      0     2
3 C             53       0     0      1     0

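Note: .groups is an argument of summarise(), not a column. Setting .groups = "drop" returns an ungrouped result, which is equivalent to adding %>% ungroup() after the calculation. With a single grouping variable, as here, the result would be ungrouped by default anyway, so the argument mainly makes that explicit. Any other name = value pair passed to summarise() is treated as a summary column to create.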
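The effect of .groups is easier to see with two grouping variables. The following is a minimal sketch, not part of the original answer: it groups by Program and Gender (an assumption made only for illustration) and compares the grouping that remains after each option.

```r
library(dplyr)

df <- data.frame(
  Program = c("A", "A", "A", "B", "B", "C"),
  Gender  = c("F", "F", "M", "M", "M", "F")
)

# "drop_last" removes only the last grouping variable (Gender);
# the result is still grouped by Program
kept <- df %>%
  group_by(Program, Gender) %>%
  summarise(n = n(), .groups = "drop_last")

# "drop" removes all grouping
dropped <- df %>%
  group_by(Program, Gender) %>%
  summarise(n = n(), .groups = "drop")

# Explicitly ungrouping the grouped result gives the same table
ungrouped <- kept %>% ungroup()

group_vars(kept)     # remaining grouping variables after "drop_last"
group_vars(dropped)  # remaining grouping variables after "drop"
```

If later steps such as mutate() or slice() should operate on the whole table rather than per program, the "drop" variant (or an explicit ungroup()) is the safer choice.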

0
votes

In base R you could do:

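# Stack Gender and Language into a single "values" column next to Program and Age
df1 <- cbind(df[1:2], stack(df[3:4])[-2])
# Mean age per program, bound to the per-program counts of each Gender/Language value
cbind(aggregate(Age ~ Program, df, mean), as.data.frame.matrix(table(df1[-2])))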
  Program      Age Eng F Kor M Other
A       A 23.33333   2 2   1 1     0
B       B 22.00000   0 0   1 2     1
C       C 53.00000   0 1   0 0     1
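The base R result lists the count columns alphabetically and includes Other, which the question did not ask for. A possible follow-up step, sketched here under the assumption that R 4.0 or later is used (so the columns of df are character vectors, which stack() requires), selects and renames the columns to match the requested layout. The names res and out are introduced only for this illustration.

```r
Program  <- c("A", "A", "A", "B", "B", "C")
Age      <- c(10, 30, 30, 12, 32, 53)
Gender   <- c("F", "F", "M", "M", "M", "F")
Language <- c("Eng", "Eng", "Kor", "Kor", "Other", "Other")
df <- data.frame(Program, Age, Gender, Language)

# Same computation as in the answer above
df1 <- cbind(df[1:2], stack(df[3:4])[-2])
res <- cbind(aggregate(Age ~ Program, df, mean),
             as.data.frame.matrix(table(df1[-2])))

# Keep only the requested columns, in the requested order
out <- res[c("Program", "Age", "Eng", "Kor", "F", "M")]
names(out) <- c("Program", "MEAN AGE", "ENG", "KOR", "FEMALE", "MALE")
rownames(out) <- NULL  # replace the A/B/C row names with default numbering
out
```

Selecting columns by name rather than by position keeps the step working if further Gender or Language values appear in the data.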