I have four columns from a data table df
I would like to base a fifth column off of. The four current columns are - year
, month
, id
, and conflict
. Right now the conflict
column has only 1's and 0's and for a given id grouping once a 1 occurs in a year then 1's occur for the rest of that year's months. I want to mutate the conflict
column into a new column conflict_mutated
like so: If we are on a given year which contains a 1 on any month AND the previous year contains a 1 on any month, I want the current year's months to be all 1's for conflict_mutated
, while also keeping all the old 1s.
So if we have data like:
year month id conflict
1989 6 33 0
1989 7 33 0
1989 8 33 1
1989 9 33 1
1989 10 33 1
1989 11 33 1
1989 12 33 1
1990 1 33 0
1990 3 33 0
1990 3 33 0
1990 4 33 0
1990 5 33 1
1990 6 33 1
1990 7 33 1
1990 8 33 1
1990 9 33 1
1990 10 33 1
1990 11 33 1
1990 12 33 1
And so I want the 0's in conlfict
during months 1, 2, 3, and 4 to be 1's since they are the same id and both 1989 (the previous year) and 1990 had 1's in them. The previous example data would then look like this:
year month id conflict conflict_mutated
1989 6 33 0 0
1989 7 33 0 0
1989 8 33 1 1
1989 9 33 1 1
1989 10 33 1 1
1989 11 33 1 1
1989 12 33 1 1
1990 1 33 0 1
1990 3 33 0 1
1990 3 33 0 1
1990 4 33 0 1
1990 5 33 1 1
1990 6 33 1 1
1990 7 33 1 1
1990 8 33 1 1
1990 9 33 1 1
1990 10 33 1 1
1990 11 33 1 1
1990 12 33 1 1
I have a solution but it would take almost 3 days to complete. It is as follows:
conflict_mutated = df$conflict
for (i in 1:length(nrow(df)) {
if (df$year[i] != 1989 & any(filter(df, id == df$id[i],
year == (df$year[i] - 1))$conflict == 1) &
any(filter(df, id == df$id[i], year == df$year[i])$conflict == 1))
{conflict_mutated[i] = 1}
Is there any way to utilize group_by and mutate to make this faster or better? Having trouble thinking of how this would be done considering grouped years have to be taken into account and shifted within the conditional logic combined with varying id's.