
I am learning Pig and have the following statements:

> jan = LOAD 'hdfs:/201001hourlyx.txt USING PigStorage(',');
> feb = LOAD 'hdfs:/201002hourlyx.txt USING PigStorage(',');
> month_quad = UNION jan,feb;
> STORE month_quad INTO 'hdfs:/month_quad';
> SPLIT month_quad INTO split_jan IF (SUBSTRING(data, 4, 6) == '01');

I am getting the following error on the SPLIT statement:

ERROR org.apache.pig.tools.grunt.GRUNT - ERROR 1200 < line 5 column 67 > syntax error, unexpected symbol at or near ':'

Does anyone have a suggestion as to what the syntax error is?


1 Answer


In Pig Latin, SPLIT does not work with a single condition. It is meant to partition a relation into two or more relations.

Syntax:

SPLIT alias INTO alias IF expression, alias IF expression [, alias IF expression …];

At least two IF clauses are required to use the SPLIT operator, whereas your statement contains only one, so Pig expects more.

Either add an additional IF clause to the statement or use its FILTER equivalent:

split_jan = FILTER month_quad by (SUBSTRING(data, 4, 6) == '01');
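
For the other option, adding a second IF clause, a minimal sketch might look like the following. The alias split_feb and the '02' condition are hypothetical, chosen only to match the February file loaded in the question. It also assumes, as the question's statement does, that month_quad has a chararray field named data.

```pig
-- Sketch only: split_feb and its '02' condition are illustrative assumptions.
-- Assumes month_quad has a chararray field named data, as in the question.
SPLIT month_quad INTO
    split_jan IF (SUBSTRING(data, 4, 6) == '01'),
    split_feb IF (SUBSTRING(data, 4, 6) == '02');
```

Rows that match neither condition are dropped from both outputs, so if only the January rows are needed, the FILTER form is the simpler choice.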