
Say I have a relation as follows:

(A, (1, 2, 3))
(B, (2, 3))

Is it possible to make a new relation by expanding the bag elements as follows, using Pig Latin?

(A, 1)
(A, 2)
(A, 3)
(B, 2)
(B, 3)

I tried using FOREACH and GENERATE, but I am having difficulty generating a new tuple for each element while looping through the bag.

Thanks,

------------- EDIT -------------

Here's a sample input:

A    1 2 3
B    2 3

The title and the links are separated by a tab; the links themselves are separated by a single space.

I used STRSPLIT to split the links on whitespace, which generates a tuple:

raw_x = LOAD './sample.txt' using PigStorage('\t') AS (title:chararray, links:chararray);
data_x = FOREACH raw_x GENERATE title, STRSPLIT(links, '\\s+') AS links;
It looks like your tuple fields are not of fixed length. Can you paste your Pig script and sample input? - Sivasakthi Jayaraman
Yes, the tuple fields are not of fixed length. I added sample input and the script. I'm guessing I should take a different approach to handle this case? Any suggestion would be appreciated! - wns349
Updated the solution, please check it. - Sivasakthi Jayaraman

1 Answer


Can you try this?

input.txt

A       1 2 3
B       2 3

PigScript:

A = LOAD 'input.txt' USING PigStorage() AS (title:chararray,links:chararray);
B = FOREACH A GENERATE title,FLATTEN(TOKENIZE(links));
DUMP B;

Output:

(A,1)
(A,2)
(A,3)
(B,2)
(B,3)
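
Why this works, as far as I understand it: STRSPLIT returns a tuple, and FLATTEN on a tuple only promotes its fields to columns of the same row. TOKENIZE returns a bag of single-field tuples, and FLATTEN on a bag un-nests it into one output row per element, which is the expansion asked for. A minimal sketch applying the same idea to the script from the question, assuming the file is ./sample.txt with a tab between title and links (the alias name link is just illustrative):

```pig
-- Assumption: ./sample.txt has a tab between the title and the space-separated links.
raw_x  = LOAD './sample.txt' USING PigStorage('\t') AS (title:chararray, links:chararray);

-- TOKENIZE splits the chararray into a bag with one tuple per token;
-- FLATTEN then emits one (title, link) row for each tuple in that bag.
data_x = FOREACH raw_x GENERATE title, FLATTEN(TOKENIZE(links)) AS link;

DUMP data_x;
```

Note that TOKENIZE with no delimiter argument also splits on a few other characters besides spaces (commas, for example), so this assumes the links contain none of those.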