I have two RDDs that I need to join. They look like the following:
RDD1
[(u'2', u'100', 2),
(u'1', u'300', 1),
(u'1', u'200', 1)]
RDD2
[(u'1', u'2'), (u'1', u'3')]
My desired output is:
[(u'1', u'2', u'100', 2)]
So I would like to keep the elements of RDD2 whose second value matches the first value of an element in RDD1, and append the remaining RDD1 fields to them. I have tried both join and cartesian, but neither gives me anything close to what I am looking for. I am new to Spark and would appreciate any help.
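To make the matching rule concrete, here is a plain-Python sketch (no Spark involved) that reproduces the desired output from the sample data. It assumes the match is between the second value of each RDD2 element and the first value of each RDD1 element, which is what the desired output suggests. The names `rdd1_data`, `rdd2_data`, and `join_on_key` are illustrative only.

```python
from collections import defaultdict

# Sample data copied from the question (plain lists standing in for the RDDs).
rdd1_data = [(u'2', u'100', 2), (u'1', u'300', 1), (u'1', u'200', 1)]
rdd2_data = [(u'1', u'2'), (u'1', u'3')]


def join_on_key(left, right):
    """Inner join: second field of each left row against first field of each right row."""
    # Index the right-hand rows by their first field, keeping the remaining fields.
    by_key = defaultdict(list)
    for row in right:
        by_key[row[0]].append(row[1:])
    # For every left row, emit one combined tuple per matching right row.
    return [l + rest for l in left for rest in by_key.get(l[1], [])]


result = join_on_key(rdd2_data, rdd1_data)
print(result)

# Assumption, not run here: in PySpark the same idea would likely be keying
# both RDDs on the shared value and then joining, roughly:
#   rdd2.map(lambda x: (x[1], x)) \
#       .join(rdd1.map(lambda x: (x[0], x[1:]))) \
#       .map(lambda kv: kv[1][0] + kv[1][1])
```

Only (u'1', u'2') finds a partner, because no RDD1 element starts with u'3', so the result contains the single combined tuple shown as the desired output.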
Thanks