I'm new to Apache Pig and want to implement bottom-up cubing in a Pig script. However, this requires grouping the data hierarchically.
For example, if my data has the form (exchange, symbol, date, dividend), where dividend is a measure and the rest are dimensions, I would like to first group the data by exchange and print the aggregate dividend, then group by exchange and symbol, and so on.
One way to do this is to write every possible grouping in the script: group by exchange, group by symbol, group by (exchange, symbol), etc. However, this seems suboptimal, since each grouping starts again from the raw data. Is there a way to, for example, first group by exchange and then, within each exchange group, group by symbol, so that the aggregates for (exchange) and for (exchange, symbol) come out of the same grouping? I would expect that to be more efficient.
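To make the computation I am after concrete, here is a minimal sketch in plain Python (not Pig Latin). The sample rows and the `rollup` function are made up for illustration, and it assumes the aggregate is a sum, so the coarser totals can be built from the finer ones instead of from the raw rows.

```python
from collections import defaultdict

# Hypothetical sample rows: (exchange, symbol, date, dividend)
rows = [
    ("NYSE", "CLI", "2009-12-30", 0.5),
    ("NYSE", "CLI", "2009-09-30", 0.25),
    ("NYSE", "CSL", "2009-11-10", 0.25),
    ("NASDAQ", "AAPL", "2009-10-01", 1.0),
]

def rollup(rows):
    """Sum dividends per (exchange, symbol), then per exchange."""
    # Finest level first: a single pass over the raw rows.
    by_exchange_symbol = defaultdict(float)
    for exchange, symbol, _date, dividend in rows:
        by_exchange_symbol[(exchange, symbol)] += dividend

    # Coarser level is built from the partial sums, not from the raw rows.
    by_exchange = defaultdict(float)
    for (exchange, _symbol), total in by_exchange_symbol.items():
        by_exchange[exchange] += total

    return dict(by_exchange), dict(by_exchange_symbol)

by_exchange, by_exchange_symbol = rollup(rows)
print(by_exchange)
print(by_exchange_symbol)
```

What I am asking is whether a Pig script can express this kind of reuse, rather than running one independent GROUP per combination of dimensions.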
Something similar is discussed in "Can I generate nested bags using nested FOREACH statements in Pig Latin?", but it didn't answer my question. Thanks!