I have 2 paired RDDs like below
RDD1 contains name as key and zipcode as value:
RDD1 -> RDD( (ashley, 20171), (yash, 33613), (evan, 40217) )
RDD2 contains zip code as key and some random number as value:
RDD2 -> RDD( (20171, 235523), (33613, 345345345), (40189, 44355217), (40122, 2345235), (40127, 232323424) )
I need to replace the zipcodes in RDD1 with the corresponding values from RDD2. So the output would be
RDD3 -> RDD( (ashley, 235523), (yash, 345345345), (evan, 232323424) )
I tried doing it using the RDD lookup method like below but I got exception saying that RDD transformations cannot be perfomed inside another RDD transformation
val rdd3 = rdd1.map( x => (x._1, rdd2.lookup(x._2)(0)) )
rdd2
is small, you could collect it on the driver and broadcast it, then what you're trying would be possible. Otherwise, you'll probably have to play with joins to achieve what you want. – vanza