I have a DataFrame where I want to assign an id to each Window partition. For example, I have:
id | col |
1 | a |
2 | a |
3 | b |
4 | c |
5 | c |
So I want (based on grouping by column col):
id | group |
1 | 1 |
2 | 1 |
3 | 2 |
4 | 3 |
5 | 3 |
I want to use a window function, but I cannot find any way to assign an id to each window. I need something like:
w = Window().partitionBy('col')
df = df.withColumn("group", id().over(w))
Is there any way to achieve something like that? (I cannot simply use col as the group id because I am interested in creating a window over multiple columns.)