I have a data frame that looks like this (these are the first 24 rows):
row Name Age 1 John 22 2 Alice 29 3 Michael 33 4 Briefing NA 5 Class A 6 Year 2016 7 Observation 0 8 End NA 9 Steward 35 10 Louis 20 11 Josh 22 12 Marie 39 13 Briefing NA 14 Observation 2 15 Year 2017 16 End NA 17 Adam 27 18 Joseph 26 19 Andrew 26 20 Briefing NA 21 Observation 2 22 Year 2017 23 Class B 24 End NA
I want it to look like this
Name Age Class Year Observation John 22 A 2016 0 Alice 29 A 2016 0 Michael 33 A 2016 0 Steward 35 NA 2017 2 Louis 20 NA 2017 2 Josh 22 NA 2017 2 Marie 39 NA 2017 2 Adam 27 B 2017 2 Joseph 26 B 2017 2 Andrew 26 B 2017 2
I understand that this process requires me to:
- Detect every range where the data must be transferred: from the word "Briefing" to the word "End".
- I also need to sort the information into their respective columns since the components change: the first group would transfer information to the columns "Class", "Year" and "Observation" while the second group only to the columns "Year" and "Observation" leaving the column "Class" with an NA value. The third group once more provides information for the columns "Class", "Year" and "Observation".
I would be grateful for your help to tackle this task with R.
Edit: Here is the dput
structure(list(row = 1:24, Name = structure(c(7L, 2L, 12L, 4L,
5L, 15L, 13L, 6L, 14L, 10L, 9L, 11L, 4L, 13L, 15L, 6L, 1L, 8L,
3L, 4L, 13L, 15L, 5L, 6L), .Label = c("Adam", "Alice", "Andrew",
"Briefing", "Class", "End", "John", "Joseph", "Josh", "Louis",
"Marie", "Michael", "Observation", "Steward", "Year"), class = "factor"),
Age = structure(c(6L, 9L, 10L, NA, 13L, 4L, 1L, NA, 11L,
3L, 6L, 12L, NA, 2L, 5L, NA, 8L, 7L, 7L, NA, 2L, 5L, 14L,
NA), .Label = c("0", "2", "20", "2016", "2017", "22", "26",
"27", "29", "33", "35", "39", "A", "B"), class = "factor")), .Names = c("row",
"Name", "Age"), class = "data.frame", row.names = c(NA, -24L))
dput
to post your data so we can work with it? – gfgm