
When should I use FLATTEN in Pig? I can't work it out from the documentation. Pig's error messages often seem unrelated to the actual problem: sometimes it says that flatten could not be imported, yet the same flatten works elsewhere.

Please revise the question with the exceptions you're getting. - Virmundi
Also, please add the code you have tried to execute. - Bart Schuijt

1 Answer


Whenever you use the GROUP command on a field in your data, Pig collects all the tuples that share a value of that field into a bag, which is sometimes quite cumbersome to read. If you apply FLATTEN to that bag after grouping, each tuple is listed as a separate record in your output. The drawback of using FLATTEN is that the same record can appear more than once, so to remove duplicates you need to write an extra piece of code (for example, a DISTINCT step).

Example of non-flattened code:

X = GROUP A BY f1;

DUMP X;

(1,{(1,2,3)})

(4,{(4,2,1),(4,3,3)})

(8,{(8,3,4)})

Example of flattened code:

X = GROUP A BY f1;

Y = FOREACH X GENERATE FLATTEN(A);

DUMP Y;

(1,2,3)

(4,2,1)

(4,3,3)

(8,3,4)
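
To make the difference between the two outputs concrete, here is a minimal Python sketch that models the semantics, not Pig itself. It assumes A has three integer fields (f1, f2, f3) holding the tuples shown above. The helper names (group_by_first, flatten_bag, flatten_with_key) are made up for illustration. Pig bags are unordered, so the ordering produced here is incidental.

```python
from itertools import groupby
from operator import itemgetter

# Relation A with assumed fields (f1, f2, f3), matching the tuples in the example.
A = [(1, 2, 3), (4, 2, 1), (8, 3, 4), (4, 3, 3)]


def group_by_first(rows):
    """Model of GROUP A BY f1: one (key, bag) pair per distinct f1 value."""
    ordered = sorted(rows, key=itemgetter(0))  # groupby needs sorted input
    return [(key, list(bag)) for key, bag in groupby(ordered, key=itemgetter(0))]


def flatten_bag(grouped):
    """Model of flattening only the bag: one output row per tuple in each bag."""
    return [row for _key, bag in grouped for row in bag]


def flatten_with_key(grouped):
    """Model of emitting the group key next to the flattened bag:
    the key is repeated on every row that came from that bag."""
    return [(key,) + row for key, bag in grouped for row in bag]


X = group_by_first(A)
print(X)                    # grouped: each key paired with its bag of tuples
print(flatten_bag(X))       # flattened: the tuples as separate rows
print(flatten_with_key(X))  # key plus tuple: the key value appears twice per row
```

The last call illustrates one way repetition creeps in: if the group key is generated alongside the flattened bag, its value shows up both as the key and inside each tuple, and rows from the same bag all carry the same key.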