How much time does an Apache Pig query take to execute? The query, written in Pig Latin, fetches records from up to 4 million tuples (rows), each having 43 fields.

A = LOAD '/user/PigTest/year_14/mon_nov/6_sms_03_01.csv' USING PigStorage(',');
bt = foreach A generate $0 as id,$3;
dump bt;
ct = filter bt by id == 3981042 ;
dump ct;
dump MinutesBetween(CurrentTime(),$ti);

I run the script as: pig -param ti='date' try.pig

My system environment is Linux.

The error is: ERROR 1200: mismatched input '(' expecting RIGHT_PAREN

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. mismatched input '(' expecting RIGHT_PAREN
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1725)
    at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1420)
    at org.apache.pig.PigServer.parseAndBuild(PigServer.java:364)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:389)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:375)
    at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:170)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:232)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:203)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
    at org.apache.pig.Main.run(Main.java:608)
    at org.apache.pig.Main.main(Main.java:156)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: Failed to parse: mismatched input '(' expecting RIGHT_PAREN

1 Answer

   There are two problems here:
    1. DUMP accepts only a relation alias, but you are passing it the function call MinutesBetween() directly.
       If you remove the last line, the error goes away.
    2. On the command line you are passing the literal string 'date' as the parameter. In Pig, 'date' is not a built-in command, so you need to supply the date in one of the formats that Pig supports.

    Example:
       I am using the date format '2014-11-06T06:01:13'; other supported date formats are listed in the Pig docs.

    On the command line:
    >>pig -param ti='2014-11-06T06:01:13' -f try.pig 

    Change the last line of the Pig script like this:
    test = FOREACH ct GENERATE MinutesBetween(CurrentTime(),ToDate('$ti'));
    DUMP test;
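
If the intent behind ti='date' was to pass the current time from the shell, command substitution can produce a timestamp in the same format as the example above. This is only a sketch: iso_now is a hypothetical helper name, and the pig invocation is printed rather than executed.

```shell
#!/bin/bash
# Hypothetical helper: print the current time in the format used in the
# example above, e.g. 2014-11-06T06:01:13.
iso_now() {
    date +"%Y-%m-%dT%H:%M:%S"
}

TI=$(iso_now)

# Print the command that would be run; remove the leading "echo" to run it.
echo pig -param "ti=${TI}" -f try.pig
```

Quoting "ti=${TI}" keeps the whole name=value pair as a single argument to pig.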

UPDATE:

Create a shell script, say test.sh, that does the following:
1. Get the current time (i.e. start_time).
2. Call the Pig script (try.pig).
3. Get the current time (i.e. end_time).
4. Compute the time difference and print it; this gives the actual time taken by the Pig script. You can modify the script to include hours and milliseconds as well.

test.sh

    #!/bin/bash
    START_TIME=$(date +"%s")

    pig -x local try.pig

    END_TIME=$(date +"%s") 
    DIFF=$(($END_TIME-$START_TIME))
    echo "$(($DIFF / 60)) minutes and $(($DIFF % 60)) seconds."

Sample output:

0 minutes and 2 seconds.
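
One possible way to extend test.sh to report hours and milliseconds is sketched below. It assumes GNU date, whose %N format gives nanoseconds (available on typical Linux systems). format_duration and time_command are hypothetical helper names introduced for illustration.

```shell
#!/bin/bash
# Hypothetical helper: format a duration given in milliseconds.
format_duration() {
    local total_ms=$1
    local ms=$((total_ms % 1000))
    local total_s=$((total_ms / 1000))
    local h=$((total_s / 3600))
    local m=$(( (total_s % 3600) / 60 ))
    local s=$((total_s % 60))
    echo "${h} hours, ${m} minutes, ${s} seconds and ${ms} milliseconds."
}

# Hypothetical helper: run the given command, then print how long it took.
time_command() {
    local start_ms end_ms status
    # GNU date: %s is seconds since the epoch, %N is nanoseconds.
    start_ms=$(( $(date +%s%N) / 1000000 ))
    "$@"
    status=$?
    end_ms=$(( $(date +%s%N) / 1000000 ))
    format_duration $((end_ms - start_ms))
    return $status
}

# Usage with the script from this answer:
#   time_command pig -x local try.pig
```

The measured time covers the whole pig process, including startup, so it is an upper bound on the time taken by the query itself.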