2
votes

Let's say I have a data set like this:

0,11,2,3,4,5,56,7
0,1,21,13,45,5,61,75
01,1,2,3,54,55,6,75

What I am looking to do is flatMap the values into key-value pairs, where the key is the column index and the value is the entry in that column. Can anyone give me guidance? I'm finding it hard to get the column index.


2 Answers

2
votes

Assuming that each element of your RDD is a sequence type, you could do:

rdd.flatMap(line => line.zipWithIndex.map(tuple => tuple.swap))
1
votes

Creating key-value pairs, where the key is the list index and the value is the element at that index, could look like this:

rdd.flatMap(lambda x: enumerate(x))

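This of course assumes that your data is already an RDD whose elements are lists (for example, each line already split on commas). If the elements are still raw strings, enumerate would index the individual characters instead of the columns.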
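To see what the pairs look like for the sample data in the question, here is a minimal sketch in plain Python that mimics the map and flatMap steps without needing a Spark installation. The helper names `split_line` and `index_value_pairs` are made up for illustration, and the PySpark line in the comment assumes an existing SparkContext called `sc` and a hypothetical file name.

```python
from itertools import chain

# Sample rows copied from the question; in Spark these would be the RDD's lines.
lines = [
    "0,11,2,3,4,5,56,7",
    "0,1,21,13,45,5,61,75",
    "01,1,2,3,54,55,6,75",
]


def split_line(line):
    """Split one comma-separated line into a list of string fields."""
    return line.split(",")


def index_value_pairs(row):
    """Pair every field with its column index: (index, value)."""
    return list(enumerate(row))


# Plain-Python stand-in for the Spark pipeline. With PySpark, assuming `sc`
# is a SparkContext and "data.csv" is a hypothetical path, this would be:
#   sc.textFile("data.csv").map(split_line).flatMap(index_value_pairs)
pairs = list(
    chain.from_iterable(index_value_pairs(split_line(line)) for line in lines)
)

print(pairs[:3])  # [(0, '0'), (1, '11'), (2, '2')]
```

The values stay as strings here (so '01' is not turned into 1); convert them with int() inside `split_line` if you need numbers.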