Not able to Execute hive script

Question

I have installed a single-node Hadoop cluster and Hive. I am able to load data and display it in Hive. I want to execute a script that creates temporary functions, which requires adding jar files. The jar files are esri-geometry-api.jar, spatial-sdk-hive-1.0-MODIFIED.jar and HiveUDFs.jar.

I referred to "How to write a script file in Hive?" and got this error: esri-geometry-api.jar does not exist

My configuration details:

$ echo $HADOOP_HOME
/home/hduser/hadoop-1.2.1
$ echo $JAVA_HOME
/usr/lib/java/jdk1.7.0_55
$ echo $HIVE_HOME
/home/hduser/hadoop-1.2.1/hive-0.9.0-bin

java version "1.7.0_55"
Java(TM) SE Runtime Environment (build 1.7.0_55-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.55-b03, mixed mode)

hadoop version:

Hadoop 1.2.1
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152
Compiled by mattf on Mon Jul 22 15:23:09 PDT 2013
From source with checksum 6923c86528809c4e7e6f493b6b413a9a
This command was run using /home/hduser/hadoop-1.2.1/hadoop-core-1.2.1.jar


HIVE VERSION: hive-0.9.0
hduser@ubuntu:~$ echo $HIVE_HOME
/home/hduser/hadoop-1.2.1/hive-0.9.0-bin

I have a Hive script that I need to execute, shown below. My data contains latitude and longitude values recorded at 5-second intervals.

add jar esri-geometry-api.jar spatial-sdk-hive-1.0-MODIFIED.jar HiveUDFs.jar;
create temporary function ST_AsText as 'com.esri.hadoop.hive.ST_AsText';
create temporary function ST_Intersects as 'com.esri.hadoop.hive.ST_Intersects';
create temporary function ST_Length as 'com.esri.hadoop.hive.ST_Length';
create temporary function ST_LineString as 'com.esri.hadoop.hive.ST_LineString';
create temporary function ST_Point as 'com.esri.hadoop.hive.ST_Point';
create temporary function ST_Polygon as 'com.esri.hadoop.hive.ST_Polygon';
create temporary function ST_SetSRID as 'com.esri.hadoop.hive.ST_SetSRID';
create temporary function collect_array as 'com.zombo.GenericUDAFCollectArray';
SELECT
    id,
    unix_timestamp(dt) - unix_timestamp(fv)
FROM (
    SELECT
        id, dt, fv
    FROM (
        SELECT
            id, dt,
            FIRST_VALUE(dt) OVER (PARTITION BY id ORDER BY dt) as fv,
            ROW_NUMBER() OVER (PARTITION BY id ORDER BY dt DESC) as lastrk
        FROM
            uber
        ) sub1
    WHERE
        lastrk = 1
    ) sub2
WHERE
    (unix_timestamp(dt) - unix_timestamp(fv)) < 28800;

My questions are as below:

Do I need to start the Hadoop services before running Hive? I observed that I can run Hive directly without starting them. If so, what is the significance of having Hadoop, and how can I use it with Hive?

When I try to add the jar manually, I get the error below:

hive> ADD JAR esri-geometry-api.jar /home/hduser/hadoop_jar;
esri-geometry-api.jar does not exist

hive> add jar esri-geometry-api.jar;
esri-geometry-api.jar does not exist

I also added hive-site.xml as below:

<configuration>
<property>
<name>hive.aux.jars.path</name>
<value>file:///home/hduser/hadoop_jar/HIVEUDFs.jar,
file:///home/hduser/hadoop_jar/esri-geometry-api-1.0.jar,
file:///home/hduser/hadoop_jar/spatial-sdk-json-1.0.1-sources.jar</value>
</property>
</configuration>

I also added the jar file to the lib folder of my Hive directory inside the Hadoop folder.

When I try to run script:

hduser@ubuntu:~/queries$ hive queries.hive

WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
Logging initialized using configuration in jar:file:/home/hduser/hadoop-1.2.1/hive-0.9.0-bin/lib/hive-common-0.9.0.jar!/hive-log4j.properties
Hive history file=/tmp/hduser/hive_job_log_hduser_201404290234_597714109.txt

hive>

When I issue the list jar; command, it gives:

file:/home/hduser/hadoop-1.2.1/hive-0.9.0-bin/lib/hive-builtins-0.9.0.jar

I need to execute the script. Please help.

SachinJ · Accepted Answer · 2014-04-29T10:59:32

The script does not run because the -f option is missing. Execute the script as follows:

hduser@ubuntu:~/queries$ hive -f queries.hive
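
As a small illustration of the point above, here is a minimal shell sketch of a wrapper around hive -f. The function name run_hive_script is hypothetical and not part of Hive; it only checks that the script file exists and that hive is on the PATH before passing the file to Hive.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hive): run a Hive script file with -f.
run_hive_script() {
    script="$1"
    # Fail early if the script file is not where we think it is.
    if [ ! -f "$script" ]; then
        echo "script not found: $script" >&2
        return 1
    fi
    # Fail early if the hive launcher cannot be found.
    if ! command -v hive >/dev/null 2>&1; then
        echo "hive is not on PATH" >&2
        return 127
    fi
    # -f tells Hive to execute the statements in the file; in the question,
    # "hive queries.hive" without -f only opened the interactive prompt.
    hive -f "$script"
}

# Usage (assumes queries.hive is in the current directory):
#   run_hive_script queries.hive
```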

Hive internally uses Hadoop to store its data and MapReduce to execute queries, so the Hadoop services should be running when you execute Hive commands.
In the add jar statement, the jar's complete path should be specified, and each jar should be added in a separate statement, as follows:

add jar <PATH_TO_JAR>/esri-geometry-api.jar;
add jar <PATH_TO_JAR>/spatial-sdk-hive-1.0-MODIFIED.jar;
add jar <PATH_TO_JAR>/HiveUDFs.jar;
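
If it helps, the statements above can be generated rather than typed by hand. The following is a rough shell sketch, assuming all jars sit in one directory; the function name emit_add_jars is hypothetical. It prints one add jar statement per jar using the full path, and stops with an error if a jar is not found at that path, which is the situation Hive reports as "does not exist".

```shell
#!/bin/sh
# Hypothetical helper: print one "add jar" statement per jar, with full paths.
# Usage: emit_add_jars JAR_DIR JAR_NAME...
emit_add_jars() {
    jar_dir="$1"
    shift
    for jar in "$@"; do
        path="$jar_dir/$jar"
        # Check the file on disk before emitting the statement.
        if [ ! -f "$path" ]; then
            echo "missing jar: $path" >&2
            return 1
        fi
        # One statement per jar, each terminated by a semicolon.
        printf 'add jar %s;\n' "$path"
    done
}
```

For example, emit_add_jars /home/hduser/hadoop_jar esri-geometry-api.jar spatial-sdk-hive-1.0-MODIFIED.jar HiveUDFs.jar would print three statements that can be pasted at the top of queries.hive. The directory is taken from the hive-site.xml in the question; the jar file names there differ from the ones in the script, so use whichever names actually exist on disk.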
