My input dataset looks like this:
id1, 10, v1
id2, 9, v2
id2, 34, v3
id1, 6, v4
id1, 12, v5
id2, 2, v6
and I want this output:
id1; 6,v4 | 10,v1 | 12,v5
id2; 2,v6 | 9,v2 | 34,v3
That is, each id should map to an array of pairs, with the pairs sorted numerically by num(i):
id1: array[num(i),value(i)]
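To make the target concrete, here is a minimal sketch of the transformation in plain Python. It is not Spark code; it only pins down the expected result for the sample rows above. The helper name `group_and_sort` is made up for illustration.

```python
from collections import defaultdict

# Sample rows from the question: (id, num, value)
rows = [
    ("id1", 10, "v1"),
    ("id2", 9, "v2"),
    ("id2", 34, "v3"),
    ("id1", 6, "v4"),
    ("id1", 12, "v5"),
    ("id2", 2, "v6"),
]

def group_and_sort(rows):
    """Group (num, value) pairs by id, then sort each group by num.

    Hypothetical helper, shown only to describe the desired output.
    """
    groups = defaultdict(list)
    for row_id, num, value in rows:
        groups[row_id].append((num, value))
    # Sort inside each group after grouping, so the result does not
    # depend on the order in which the rows arrived.
    return {row_id: sorted(pairs) for row_id, pairs in groups.items()}

result = group_and_sort(rows)
for row_id in sorted(result):
    line = " | ".join(f"{num},{value}" for num, value in result[row_id])
    print(f"{row_id}; {line}")
```

Note that the sort happens after the grouping here, so nothing relies on an earlier ordering surviving the grouping step.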
What I have tried:
1. Get id and 2nd column as key, then sortByKey. But since the key is a string, it is sorted as a string, not as an int.
2. Get 2nd column as key, sortByKey, then get id as key with the 2nd column in the value, and reduceByKey. But in this case, order is not preserved while doing reduceByKey. Even groupByKey is not preserving the order. Actually this is expected.
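The string-ordering problem from the first attempt can be reproduced outside Spark. The snippet below is a small plain-Python illustration, assuming the second column was read as text; the variable names are made up for the example.

```python
# Values of the 2nd column for id1, as they would look if read as strings.
nums_as_strings = ["10", "6", "12"]

# Plain string sort compares character by character, so "10" < "12" < "6".
as_strings = sorted(nums_as_strings)

# Converting to int for the comparison gives numeric order instead.
as_ints = sorted(nums_as_strings, key=int)

print(as_strings)  # ['10', '12', '6']
print(as_ints)     # ['6', '10', '12']
```

This suggests the 2nd column needs to be converted to an integer before it is used as (part of) a sort key.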
Any help will be appreciated.