I want to concatenate all records from the same file using Pig. After loading the data with PigStorage and the '-tagFile' option, my data looks like:

(filename, aaaaaaaaaaa)
(filename, bbbbbbbbbbbbbb)

The result I want is:

(filename, aaaaaaaaaaabbbbbbbbbbbbbb)

Then I can store the data into HBase with filename as rowkey.

Any suggestions would be appreciated.

1 Answer

GROUP the data by filename, then use BagToString to concatenate the values in each group's bag into a single string.

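-- A is the relation loaded with '-tagFile'; its first field is assumed to be named filename.
B = GROUP A BY filename;
-- $1 is the record text; the empty delimiter joins the values with nothing in between.
C = FOREACH B GENERATE group, BagToString(A.$1, '');
DUMP C;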
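
For the HBase step mentioned in the question, an end-to-end version might look like the sketch below. The input path 'input_dir', the tab delimiter, the field name line, the table name 'files', and the column 'cf:content' are all placeholders rather than anything from the question, and the HBase table is assumed to exist already with that column family.

```pig
-- Hypothetical input path and delimiter; '-tagFile' prepends the source file name as the first field.
A = LOAD 'input_dir' USING PigStorage('\t', '-tagFile') AS (filename:chararray, line:chararray);

-- One group per file; each group holds a bag of that file's records.
B = GROUP A BY filename;

-- Join the record text in each bag into one string, with no separator.
C = FOREACH B GENERATE group AS filename, BagToString(A.line, '') AS content;

-- HBaseStorage uses the first field (filename) as the row key.
-- 'files' and 'cf:content' are hypothetical names.
STORE C INTO 'hbase://files' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:content');
```

Bags in Pig are unordered, so the order in which the records are joined is not guaranteed. If the order matters, the data needs a field to sort on, applied with ORDER inside a nested FOREACH before calling BagToString.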