I have a relation like this:

R1 : a:chararray,b:chararray,c:bag{t:tuple(c1:chararray,c2:chararray)}

So the data looks like:

(a,b,{(aa,bb),(cc,dd)})  
(e,f,{(gg,hh),(ii,jj)})

And I want to get this:

R2 : c:bag{t:tuple(c1:chararray,c2:chararray,b:chararray,a:chararray)}

So:

{(aa,bb,b,a),(cc,dd,b,a)}
{(gg,hh,f,e),(ii,jj,f,e)}

I tried several solutions with a nested FOREACH that flattens the bag, and I also tried a cross join, but none of them gave a good result.

In particular, I expected this to work:

FOREACH R1 {
    flatC = FOREACH R1 GENERATE FLATTEN(c) as c1,c2,c3;
    GENERATE
        a,
        b,
        c1,
        c2,
        c3;
};

Does anyone have an idea?

Thanks

1 Answer

One option is to flatten the bag together with b and a, then group by (b,a) to rebuild the bag:

input.txt

a       b       {(aa,bb),(cc,dd)}
e       f       {(gg,hh),(ii,jj)}

PigScript:

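A = LOAD 'input.txt' AS (a:chararray,b:chararray,c:bag{(c1:chararray,c2:chararray)});
-- One row per inner tuple, with b and a appended to each
B = FOREACH A GENERATE FLATTEN(c),b,a;
-- Regroup the rows that came from the same original record
C = GROUP B BY (b,a);
-- $0 is the group key, $1 is the bag of grouped tuples
D = FOREACH C GENERATE $1;
DUMP D;
DESCRIBE D;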

Output:

({(aa,bb,b,a),(cc,dd,b,a)})
({(gg,hh,f,e),(ii,jj,f,e)})
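
A possible variant of the script above, as a sketch only. It refers to the grouped bag by its alias (B) instead of $1 and renames it to c, so the result schema is closer to the R2 asked for. The field names c1 and c2 and the alias c are taken from the question's schema; I have not run this against a particular Pig version.

```pig
-- Same input file and layout as above (tab-separated, bag in the third column)
A = LOAD 'input.txt' AS (a:chararray, b:chararray, c:bag{t:tuple(c1:chararray, c2:chararray)});

-- Flatten the bag and name the flattened fields explicitly
B = FOREACH A GENERATE FLATTEN(c) AS (c1, c2), b, a;

-- After GROUP, each row has two fields: group (the key) and B (the bag)
C = GROUP B BY (b, a);

-- Keep only the bag and rename it to c
D = FOREACH C GENERATE B AS c;

DESCRIBE D;
DUMP D;
```

Two things to keep in mind with this approach: if two input records share the same (a,b) values, GROUP will merge their tuples into one bag, and the order of tuples inside the bag after GROUP is not guaranteed. If either matters for your data, you would need a unique key per input record, which is not shown here.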