I need to shift Col4 left in a PySpark DataFrame, based on columns Col2 and Col3. The Col4 value should change whenever the consecutive Col2 value changes. Col3 mainly tracks each new sequence of Col2 values. The final output should also be partitioned by Col1. The expected result is the shift_col4 column below.

ID  Col1  Col2  Col3  Col4  shift_col4
1   1     10    1     4     null
2   1     11    1     8     4
3   1     12    1     12    8
4   1     1     2     16    12
5   1     2     2     20    16
4   2     1     1     16    null
5   2     2     1     20    16