I have these two key-value RDDs in Spark:

rdd1 = [(u'Key1', 4), (u'Key2', 6), (u'Key3', 10)]
rdd2 = [(u'Key1', 4), (u'Key2', 3), (u'Key3', 2)]

I am looking for the Spark function that divides the values key by key (rdd3 = rdd1 / rdd2).

In this case:

rdd3 = [(u'Key1', 1), (u'Key2', 2), (u'Key3', 5)]

1 Answer

You can use join followed by mapValues:

rdd1.join(rdd2).mapValues(lambda x: x[0] / x[1])
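
To make the intermediate shape explicit, here is a minimal plain-Python sketch of what that chain computes on the sample data. It does not use Spark: the helpers join and map_values are hypothetical stand-ins written for illustration, and the lists stand in for the RDDs. It also shows that under Python 3 the / operator yields floats, so // is one option if integer results like those in the question are wanted.

```python
# Plain-Python model of rdd1.join(rdd2).mapValues(...); no Spark required.
# The lists below stand in for the RDDs from the question.
rdd1 = [("Key1", 4), ("Key2", 6), ("Key3", 10)]
rdd2 = [("Key1", 4), ("Key2", 3), ("Key3", 2)]


def join(left, right):
    """Hypothetical helper: inner join on key, giving (key, (left_value, right_value))."""
    return [(k, (v, w)) for k, v in left for k2, w in right if k == k2]


def map_values(pairs, f):
    """Hypothetical helper: apply f to each value and leave the keys unchanged."""
    return [(k, f(v)) for k, v in pairs]


# After the join, every value is a 2-tuple, which is why the lambda indexes x[0] and x[1].
joined = join(rdd1, rdd2)

rdd3 = map_values(joined, lambda x: x[0] / x[1])       # true division (floats in Python 3)
rdd3_int = map_values(joined, lambda x: x[0] // x[1])  # floor division (ints)

print(joined)    # [('Key1', (4, 4)), ('Key2', (6, 3)), ('Key3', (10, 2))]
print(rdd3)      # [('Key1', 1.0), ('Key2', 2.0), ('Key3', 5.0)]
print(rdd3_int)  # [('Key1', 1), ('Key2', 2), ('Key3', 5)]
```

Spark's join is an inner join, so a key that appears in only one of the two RDDs is dropped from the result, and the order of the output pairs is not guaranteed.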