I have a dataset that looks something like this, where I have a different teams (A, B) for each classroom (ID), 3 scores for each team (e.g., team1_1 team1_2 team1_3) and a mean score for each team. In other words, Classroom (ID) 1 has two teams (A, B), and Classroom 2 has two teams (A, B), and so on.
df <- data.frame(ID=1:5,
Team1=c("A","A","A","A","A"),
Team1_1=c(2,3,1,3,4),
Team1_2=c(4,2,2,3,2),
Team1_3=c(5,3,4,5,4),
Team1_Mean=c(3.67,2.67,2.33,3.67,3.33),
Team2=c("B","B","B","B","B"),
Team2_1=c(4,3,3,5,4),
Team2_2=c(4,2,2,5,2),
Team2_3=c(4,4,4,5,3),
Team2_Mean=c(4,3,3,5,3))
ID Team1 Team1_1 Team1_2 Team1_3 Team1_Mean Team2 Team2_1 Team2_2 Team2_3 Team2_Mean
1 A 2 4 5 3.67 B 4 4 4 4
2 A 3 2 3 2.67 B 3 2 4 3
3 A 1 2 4 2.33 B 3 2 4 3
4 A 3 3 5 3.67 B 5 5 5 5
5 A 4 2 4 3.33 B 4 2 3 3
I would like to reshape the data so that each Team is listed in long format, which would look like this:
df2 <- data.frame(ID=c(1,1,2,2,3,3,4,4,5,5),
Team=c("A", "B", "A", "B", "A", "B", "A", "B", "A", "B"),
Score1=c(2,4,3,3,1,3,3,5,4,5),
Score2=c(4,4,2,2,2,2,3,5,2,2),
Score3=c(5,4,3,4,4,4,5,5,4,3),
Mean=c(3.67,4,2.67,3,2.33,3,3.67,5,3.33,3))
ID Team Score1 Score2 Score3 Mean
1 A 2 4 5 3.67
1 B 4 4 4 4.00
2 A 3 2 3 2.67
2 B 3 2 4 3.00
3 A 1 2 4 2.33
3 B 3 2 4 3.00
4 A 3 3 5 3.67
4 B 5 5 5 5.00
5 A 4 2 4 3.33
5 B 5 2 3 3.00
In my mind, I thought putting the data for each team in a list, reshaping into long format, and then unnesting the data would do the trick, but it wasn't that easy. My first step was as such:
df <- df %>% nest(items = c("Team1", "Team1_1", "Team1_2", "Team1_3", "Team1_Mean"))
names(df)[names(df) == "items"] <- "Team1"
df <- df %>% nest(items = c("Team2", "Team2_1", "Team2_2", "Team2_3", "Team2_Mean"))
names(df)[names(df) == "items"] <- "Team2"
# A tibble: 5 x 3
ID Team1 Team2
<int> <list> <list>
1 1 <tibble [1 × 5]> <tibble [1 × 5]>
2 2 <tibble [1 × 5]> <tibble [1 × 5]>
3 3 <tibble [1 × 5]> <tibble [1 × 5]>
4 4 <tibble [1 × 5]> <tibble [1 × 5]>
5 5 <tibble [1 × 5]> <tibble [1 × 5]>
From there I tried my usual reshape2 (melt) function but R says that's deprecated, so I tried tidyr's "gather" method, but it seems these efforts do not work on lists.
My questions are:
- How do I achieve my desired outcome?
- My dataset contains hundreds of IDs and teams, so I do not want to have to type out each variable as I did above; how can I streamline this to reduce the use of hard coding?
Much thanks in advance!