I have a simple tuple (userid, country, amount, transactionid, date, crap1, crap2, crap3).

I am using FILTER to filter out some data, and at the same time I want to drop some fields from the tuple. They exist in the tuple because I need them at an earlier point, but not after the filter.

Currently I am doing:

B = FILTER A BY date == 'xxxx';
C = FOREACH B GENERATE userid, country, transactionid;

Is it possible to do this in one statement (to speed up the query)? As I understand it, combining FILTER and GENERATE inside a FOREACH only works on nested bags.

Why are you sure that it will speed up the query when Pig has a powerful optimizer? Did you check your explain plans? - 54l3d
I assumed that there would be only one pass through the tuples instead of two. - Anand
I am still figuring out how to interpret the 'explain' plans. - Anand

1 Answer

Not possible in a single statement. FILTER and FOREACH are separate operators, each with its own syntax:

FILTER alias BY expression;

and

FOREACH { gen_blk | nested_gen_blk } [AS schema];
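
For what it's worth, keeping the two statements should not cost a second pass over the data: as far as I know, Pig pipelines a FILTER followed directly by a FOREACH in the same map phase, since neither one needs a shuffle. A minimal sketch of how to check this with EXPLAIN is below. The input path and the field types are assumptions made up for illustration; they are not from the question.

```pig
-- Hypothetical input path and field types, for illustration only
A = LOAD 'transactions.tsv' AS (
        userid:chararray, country:chararray, amount:double,
        transactionid:chararray, date:chararray,
        crap1:chararray, crap2:chararray, crap3:chararray);

-- Keep only the rows for the wanted date
B = FILTER A BY date == 'xxxx';

-- Drop the fields that are no longer needed
C = FOREACH B GENERATE userid, country, transactionid;

-- Print the logical, physical, and MapReduce plans for C
EXPLAIN C;
```

In the MapReduce plan, look at whether the filter and the projection show up inside the same map plan. If they do, the data is read once, and merging the statements would not make the query faster.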