
I am working with panel data in which individuals are followed over time. I want to check whether respondents changed their answers from one year to another. For instance, the gender variable below represents 1 for male and 0 for female. The person with ID 1 changed their answer from male to female between 2005 and 2006.

As I have millions of people in my data.frame, I would like to create a variable that takes the value 9 for respondents who changed their answer over time and 8 for respondents whose response stayed constant. Could someone please show me how to achieve that using dplyr?

id  year    unemployment       change
1   2005        1                 9
1   2006        0                 9
1   2007        0                 9
2   2007        1                 8
2   2008        1                 8

structure(list(id = structure(c(1, 1, 1, 2, 2), format.stata = "%9.0g"), 
    year = structure(c(2005, 2006, 2007, 2007, 2008), format.stata = "%9.0g"), 
    unemployment = structure(c(1, 0, 0, 1, 1), format.stata = "%9.0g"), 
    change = structure(c(9, 9, 9, 8, 8), format.stata = "%9.0g")), row.names = c(NA, 
-5L), class = c("tbl_df", "tbl", "data.frame"))
Where is the gender variable? - Karthik S

1 Answer


If we assume we need to detect a change in unemployment, not gender, we can use something like:

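library(dplyr)
# d is the data frame from the question
d %>% 
  group_by(id) %>% 
  # 8 if unemployment takes a single distinct value within the id, otherwise 9
  mutate(change = ifelse(n_distinct(unemployment) == 1, 8, 9))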

However, I would not recommend using values like 8 and 9 to code such a change variable as it cannot be readily understood.
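
As one possible alternative, here is a minimal sketch that stores the result as a logical column instead of 8/9. The column name `changed` and the object name `result` are my own choices, not from the question, and the data frame is rebuilt from the question's example without the Stata format attributes.

```r
library(dplyr)

# Example data copied from the question (Stata attributes dropped)
d <- data.frame(
  id = c(1, 1, 1, 2, 2),
  year = c(2005, 2006, 2007, 2007, 2008),
  unemployment = c(1, 0, 0, 1, 1)
)

result <- d %>%
  group_by(id) %>%
  # TRUE when the person reported more than one distinct value over time
  mutate(changed = n_distinct(unemployment) > 1) %>%
  ungroup()

result
```

Note that `n_distinct()` counts `NA` as a value of its own by default, so if `unemployment` has missing entries you may want `n_distinct(unemployment, na.rm = TRUE)`, depending on whether a missing year should count as a change.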