I have data generated as part-r output from a MapReduce job, in the following format:

(19,[2468:5.0,1894:5.0,3173:5.0,3366:5.0,3198:5.0,1407:5.0,407:5.0,1301:5.0,2153:5.0,3007:5.0])
(20,[3113:5.0,3285:5.0,3826:5.0,3755:5.0,373:5.0,3510:5.0,3300:5.0,22:5.0,1358:5.0,3273:5.0])

19 and 20 are user IDs, and the array inside the [] holds the recommendations for that user, separated by commas. I want to load this data in a tabular format: row 1 = 19, 2468, 5.0, 3175; row 2 = 19, 1894, 5.0, 3173; and so on.

How could I achieve this with Pig or Hive?

Can you confirm that the output you mention is the required one? - sumitya
What have you tried so far? - Mahendra

1 Answer

So far, I have tried Pig but haven't been able to parse the data into the desired output.

I am looking to create a report where I can display the user name (by joining with the user table), recommended movie names for the user (by joining the movie table) and the user rating.

In the data above, 19 is the user ID. Within the square brackets are the recommended movie IDs for that user, each paired with its rating as movieId:rating. Each recommendation is separated by a comma.
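
To make the target shape concrete, here is a minimal sketch of the flattening step in plain Python rather than Pig or Hive. It assumes one record per line in exactly the format shown above, and it assumes the wanted columns are user ID, movie ID, and rating (the fourth column in the example rows is not clear to me, so it is left out). The name `flatten` is made up for illustration. A script like this could probably be plugged in through Hive's TRANSFORM or Pig's STREAM, but I have not tested that here.

```python
import re

# Matches one record such as "(19,[2468:5.0,1894:5.0])".
# Assumption: every line follows exactly this layout.
RECORD = re.compile(r"^\((\d+),\[(.*)\]\)$")


def flatten(line):
    """Turn one record into a list of (user_id, movie_id, rating) rows."""
    match = RECORD.match(line.strip())
    if match is None:
        return []  # skip lines that do not fit the assumed format
    user_id = int(match.group(1))
    rows = []
    for item in match.group(2).split(","):
        if not item:
            continue  # empty recommendation list, e.g. "(20,[])"
        movie_id, rating = item.split(":")
        rows.append((user_id, int(movie_id), float(rating)))
    return rows


if __name__ == "__main__":
    sample = "(19,[2468:5.0,1894:5.0,3173:5.0])"
    # Print tab-separated rows, one recommendation per line.
    for row in flatten(sample):
        print("\t".join(str(value) for value in row))
```

Once the data is in this one-row-per-recommendation shape, joining with the user table on user ID and with the movie table on movie ID should be an ordinary join in either Pig or Hive.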