
What does group do in the following two examples from http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#FOREACH:

Example: Nested Projection

In this example, if one of the fields in the input relation is a tuple, bag or map, we can perform a projection on that field (using a dereference operator).

X = FOREACH C GENERATE group, B.b2;

DUMP X;
(1,{(3)})
(4,{(6),(9)})
(8,{(9)})

In this example multiple nested columns are retained.

X = FOREACH C GENERATE group, A.(a1, a2);

DUMP X;
(1,{(1,2)})
(4,{(4,2),(4,3)})
(8,{(8,3),(8,4)})

Is there any difference between using group and using GROUP?

Examples:

Example: Flattening

In this example the FLATTEN operator is used to eliminate nesting.

X = FOREACH C GENERATE group, FLATTEN(A);

DUMP X;
(1,1,2,3)
(4,4,2,1)
(4,4,3,3)
(8,8,3,4)
(8,8,4,3)

Another FLATTEN example.

X = FOREACH C GENERATE GROUP, FLATTEN(A.a3);

DUMP X;
(1,3)
(4,1)
(4,3)
(8,4)
(8,3)


1 Answer


From http://pig.apache.org/docs/r0.11.0/basic.html#GROUP

The first field is named "group" (do not confuse this with the GROUP operator) and is the same type as the group key.

Field names are case-sensitive, so the GROUP operator never creates a field named "GROUP"; the field is always the lowercase "group". In my opinion, the uppercase GROUP in the second FLATTEN example is a minor error in the Pig documentation.
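
To make the distinction concrete, here is a minimal sketch. It is simplified: it uses a plain GROUP on one relation, whereas the C in the question has both an A and a B bag and so must come from a COGROUP. The 'data' path and the schema are assumptions for illustration, not taken from the question.

```pig
-- Assumed input: 'data' is a placeholder path and the schema is hypothetical.
A = LOAD 'data' AS (a1:int, a2:int, a3:int);

-- Here GROUP is the operator. Pig Latin keywords are case-insensitive.
C = GROUP A BY a1;

-- C has two fields: "group" (the key, same type as a1) and
-- "A" (a bag holding the tuples of A that share that key).
DESCRIBE C;

-- Here "group" is a field reference into C, not the operator.
X = FOREACH C GENERATE group, A.a2;
DUMP X;
```

DESCRIBE is a quick way to check which field names an operator actually produced before referring to them in a FOREACH.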