My data looks like this:
+------------------+--------------------+----------------+
|               out|           timestamp|        Sequence|
+------------------+--------------------+----------------+
|0.5202757120132446|2019-11-07 00:00:...|               1|
|              null|2019-11-07 00:00:...|               2|
|              null|2019-11-07 00:00:...|               3|
|              null|2019-11-07 00:00:...|               4|
|0.5220348834991455|2019-11-07 00:00:...|               5|
| 0.724998414516449|2019-11-07 00:00:...|               6|
|              null|2019-11-07 00:00:...|               7|
|              null|2019-11-07 00:00:...|               8|
|0.7322611212730408|2019-11-07 00:00:...|               9|
|              null|2019-11-07 00:00:...|              10|
|              null|2019-11-07 00:00:...|              11|
Now I want to replace each null value with the last non-null value from an earlier row in Sequence order. I'm using a window function to achieve this, but I'm getting the following error:
'Window Frame RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW must match the required frame ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING;'
My code:
from pyspark.sql import Window
from pyspark.sql import functions as F

window1 = Window.partitionBy('timestamp').orderBy('Sequence').rangeBetween(Window.unboundedPreceding, 0)
df = df.withColumn('out', F.when(F.col('out').isNull(), F.lag('out').over(window1)).otherwise(F.col('out')))