1
votes

I am new to Hadoop and Pig. I am trying to run a sample Pig script in a CentOS 6 environment on VMware:

records = LOAD '2013_subset.csv' USING PigStorage(',') AS
(Year,Month,DayofMonth,DayOfWeek,DepTime,CRSDepTime,ArrTime,\
CRSArrTime,UniqueCarrier,FlightNum,TailNum,ActualElapsedTime,\
CRSElapsedTime,AirTime,ArrDelay,DepDelay,Origin,Dest,\
Distance:int,TaxiIn,TaxiOut,Cancelled,CancellationCode,\
Diverted,CarrierDelay,WeatherDelay,NASDelay,SecurityDelay,\
LateAircraftDelay);
milage_recs = GROUP records ALL;
tot_miles = FOREACH milage_recs GENERATE SUM(records.Distance);
STORE tot_miles INTO /user/root/totalmiles;

This code is saved to a file called totalmiles.pig. When I run it, it fails with the following error:

ERROR org.apache.pig.tools.grunt.GRUNT - -ERROR: Unexpected character '\'

When I remove the '\' characters from the code, I get a different error:

ERROR org.apache.pig.tools.grunt.GRUNT - -ERROR: mismatched input '/' expecting QUOTEDSTRING

I have not been able to find a solution to this particular error. I have also run this on a different VM (VirtualBox) under CentOS 7 and received a different error, about parameter substitution. I was hoping that someone might be able to shed some light on this.

Thanks! wasmithpfs

3
@iMassakre As your comment does not help in answering the question, I have flagged it as non-constructive. Please don't spam in the comments. Dennis Jaheruddin

3 Answers

0
votes

Remove the backslashes ("\"), and enclose the path in the STORE statement in quotes.

records = LOAD '2013_subset.csv' USING PigStorage(',') AS (Year,Month,DayofMonth,DayOfWeek,DepTime,CRSDepTime,ArrTime,CRSArrTime,UniqueCarrier,FlightNum,TailNum,ActualElapsedTime,CRSElapsedTime,AirTime,ArrDelay,DepDelay,Origin,Dest,Distance:int,TaxiIn,TaxiOut,Cancelled,CancellationCode,Diverted,CarrierDelay,WeatherDelay,NASDelay,SecurityDelay,LateAircraftDelay);
milage_recs = GROUP records ALL;
tot_miles = FOREACH milage_recs GENERATE SUM(records.Distance);
STORE tot_miles INTO '/user/root/totalmiles';
0
votes

The error seems to be quite clear:

The \ characters should not be there.

After fixing that, the script parses a bit further and you run into the next error:

Where you have a /, a quoted string is expected.

Try putting the path in quotes, like:

'/user/root/totalmiles'
0
votes

There are two issues:

1. In the LOAD statement, you don't need '\'. The query parser can handle newlines. Try the LOAD statement below.

records = LOAD '2013_subset.csv' USING PigStorage(',') AS (Year, Month, DayofMonth,
DayOfWeek, DepTime, CRSDepTime, ArrTime, CRSArrTime, UniqueCarrier, FlightNum, TailNum,
ActualElapsedTime, CRSElapsedTime, AirTime, ArrDelay, DepDelay, Origin, Dest, Distance:int,
TaxiIn, TaxiOut, Cancelled, CancellationCode, Diverted, CarrierDelay, WeatherDelay, 
NASDelay, SecurityDelay, LateAircraftDelay);
2. In the STORE statement, the output path after INTO must be enclosed in single quotes. Try the following:

    STORE tot_miles INTO '/user/root/totalmiles';
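
Putting both fixes together, the whole of totalmiles.pig might look roughly like the sketch below. It assumes the same input file, column order, and output path as in the question, and it has not been run against the actual data. The comments are only there for illustration.

```pig
-- Load the CSV; the schema may span several lines without any '\' continuation.
-- Only Distance is given a type here; the other fields default to bytearray.
records = LOAD '2013_subset.csv' USING PigStorage(',') AS
    (Year, Month, DayofMonth, DayOfWeek, DepTime, CRSDepTime, ArrTime,
     CRSArrTime, UniqueCarrier, FlightNum, TailNum, ActualElapsedTime,
     CRSElapsedTime, AirTime, ArrDelay, DepDelay, Origin, Dest,
     Distance:int, TaxiIn, TaxiOut, Cancelled, CancellationCode,
     Diverted, CarrierDelay, WeatherDelay, NASDelay, SecurityDelay,
     LateAircraftDelay);

-- Put every record into a single group so SUM runs over the whole file.
milage_recs = GROUP records ALL;

-- Total distance across all records.
tot_miles = FOREACH milage_recs GENERATE SUM(records.Distance);

-- The output path must be a quoted string.
STORE tot_miles INTO '/user/root/totalmiles';
```

You can run it with `pig totalmiles.pig` and then look at the result with something like `hadoop fs -cat /user/root/totalmiles/part-*`. Note that STORE will fail if the output directory already exists, so remove it before re-running.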