I'm having a data.frame (df
), see example, that contains information about people. Based on a key column (sleutel
), I know if people live together (e.g. form a family) or not. Now, I need to create new columns with information about the 'head' of the family.
name sex gzverh sleutel gzhfd lft
1 Loekens Man 6 1847LS 9 3 49
2 Kemel Vrouw 5 1847LK 10 2 18
3 Kemel Man 5 1847LK 10 2 22
4 Boersma Vrouw 4 1847LK 10 2 52
5 Kemel Man 2 1847LK 10 1 54
So for example: row 5, Kemel, Male and gzhfd 1 (= head of the family Kemel). He is married to mrs. Boersma (same key). I want to mutate a new column (lfthb
) with the age of the head of the family for all family members. So should become something like:
name sex gzverh sleutel gzhfd lft lfthb
1 Loekens Man 6 1847LS 9 3 49 NA
2 Kemel Vrouw 5 1847LK 10 2 18 54
3 Kemel Man 5 1847LK 10 2 22 54
4 Boersma Vrouw 4 1847LK 10 2 52 54
5 Kemel Man 2 1847LK 10 1 54 54
I tried multiple ways with dplyr
using multiple combinations of group_by
, case_when
and if_else
statements. And I manage to mutate the column for the head of the family itself. But not for the other members.
For example, evidently only changes the value for the head itself:
df <- df %>% mutate(lfthb = case_when(sleutel == lag(sleutel) & gzhfd == 1 ~ lft))
But how to include the gzhfd == 1
after the ~
?
dput of example data:
structure(list(naam = c("Loekens", "Kemel", "Kemel", "Boersma",
"Kemel"), gesl = c("Man", "Vrouw", "Man", "Vrouw", "Man"), gzverh = c(6L,
5L, 5L, 4L, 2L), sleutel = c("1847LS 9", "1847LK 10", "1847LK 10",
"1847LK 10", "1847LK 10"), gzhfd = c(3, 2, 2, 2, 1), lft = c(49,
18, 22, 52, 54)), row.names = c(NA, 5L), class = "data.frame")