I have a data set that looks like this
id name year job job2
1 Jane 1980 Worker 0
1 Jane 1981 Manager 1
1 Jane 1982 Manager 1
1 Jane 1983 Manager 1
1 Jane 1984 Manager 1
1 Jane 1985 Manager 1
1 Jane 1986 Boss 0
1 Jane 1987 Boss 0
2 Bob 1985 Worker 0
2 Bob 1986 Worker 0
2 Bob 1987 Manager 1
2 Bob 1988 Boss 0
2 Bob 1989 Boss 0
2 Bob 1990 Boss 0
2 Bob 1991 Boss 0
2 Bob 1992 Boss 0
Here, job2
denotes a dummy variable indicating whether a person was a Manager
during that year or not. I want to do two things to this data set: first, I only want to preserve the row when the person became Boss
for the first time. Second, I would like to see cumulative years a person worked as a Manager
and store this information in the variable cumu_job2
. Thus I would like to have:
id name year job job2 cumu_job2
1 Jane 1980 Worker 0 0
1 Jane 1981 Manager 1 1
1 Jane 1982 Manager 1 2
1 Jane 1983 Manager 1 3
1 Jane 1984 Manager 1 4
1 Jane 1985 Manager 1 5
1 Jane 1986 Boss 0 0
2 Bob 1985 Worker 0 0
2 Bob 1986 Worker 0 0
2 Bob 1987 Manager 1 1
2 Bob 1988 Boss 0 0
I have changed my examples and included the Worker position because this reflects more what I want to do with the original data set. The answers in this thread only works when there are only Managers and Boss in the data set - so any suggestions for making this work would be great. I'll be very much grateful!!