I'm still a beginner in Scala and Spark, so I may be missing something obvious. I have two RDDs, one of type:
((String, String), Int) = ((" v67430612_serv78i"," fb_201906266952256"),1)
The other has the type:
(String, String, String) = (r316079113_serv60i,fb_100007609418328,-795000)
As can be seen, the first two fields of both RDDs have the same format. They are IDs: one is 'tid' and the other is 'uid'.
The question is this:
Is there a way to compare the two RDDs so that tid and uid are matched across both, and all the data for each matching pair of IDs is shown in a single row without any repetition?
E.g., if I get a match of tid and uid between the two RDDs:
((String, String), Int) = ((" v67430612_serv78i"," fb_201906266952256"),1)
(String, String, String) = (" v67430612_serv78i"," fb_201906266952256",-795000)
Then the output should be:
((" v67430612_serv78i"," fb_201906266952256",-795000),1)
The IDs in the two RDDs are not in any fixed order, i.e. a given tid and uid pair may appear at different positions in the two RDDs.
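To make the requirement concrete, here is a rough sketch of the kind of result I am after, using only the core RDD API. It assumes that keying both RDDs by (tid, uid) and calling join is a reasonable approach. The names MatchIds, matchIds, counts and values are made up for illustration, and the trim calls are an assumption on my part because the IDs in my first RDD seem to carry a leading space.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object MatchIds {
  // Hypothetical helper: match the two RDDs on (tid, uid) and emit one row per match.
  def matchIds(
      counts: RDD[((String, String), Int)],
      values: RDD[(String, String, String)]
  ): RDD[((String, String, String), Int)] = {
    // Assumption: IDs may carry stray whitespace, so both sides are trimmed.
    val left  = counts.map { case ((tid, uid), n) => ((tid.trim, uid.trim), n) }
    // Re-key the flat triple so that (tid, uid) becomes the join key.
    val right = values.map { case (tid, uid, v) => ((tid.trim, uid.trim), v) }

    left.join(right)                                           // RDD[((tid, uid), (n, v))]
      .map { case ((tid, uid), (n, v)) => ((tid, uid, v), n) } // desired output shape
      .distinct()                                              // drop repeated rows
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("match-ids").setMaster("local[*]"))
    val counts = sc.parallelize(Seq(((" v67430612_serv78i", " fb_201906266952256"), 1)))
    val values = sc.parallelize(Seq(("v67430612_serv78i", "fb_201906266952256", "-795000")))
    matchIds(counts, values).collect().foreach(println)
    sc.stop()
  }
}
```

Since join only looks at keys, the order of the rows in either RDD should not matter here.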
Also, how would the solution change if the first RDD's type stays the same but the second RDD's type changes to:
((String, String, String), Int) = ((daily_reward_android_5.76,fb_193055751144610,81000),1)
I have to do this without using Spark SQL.
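For the second case, where the other RDD has type ((String, String, String), Int), I guess only the re-keying step would differ. Below is a minimal sketch under that assumption, again with the core RDD API only. The names MatchIdsNested, matchNested, counts and nested are hypothetical, the trim calls are an assumption about stray whitespace in the IDs, and the output shape (keeping both Int values side by side) is just one possible choice, since I have not decided how the two counts should be combined.

```scala
import org.apache.spark.rdd.RDD

object MatchIdsNested {
  // Hypothetical helper for the variant where the second RDD is ((tid, uid, value), count).
  def matchNested(
      counts: RDD[((String, String), Int)],
      nested: RDD[((String, String, String), Int)]
  ): RDD[((String, String, String), (Int, Int))] = {
    // Assumption: IDs may carry stray whitespace, so both sides are trimmed.
    val left = counts.map { case ((tid, uid), n) => ((tid.trim, uid.trim), n) }
    // Pull (tid, uid) out of the nested triple so it can serve as the join key.
    val right = nested.map { case ((tid, uid, value), m) =>
      ((tid.trim, uid.trim), (value, m))
    }

    left.join(right)                       // RDD[((tid, uid), (n, (value, m)))]
      .map { case ((tid, uid), (n, (value, m))) => ((tid, uid, value), (n, m)) }
      .distinct()                          // drop repeated rows
  }
}
```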