7
votes

I use Hive to create a table stored as a SequenceFile. The row format uses the SerDe class myserde.TestDeserializer, which is in hiveserde-1.0.jar.

In the command line I use this command to add the jar file:

hive> ADD JAR hiveserde-1.0.jar;

Then I create the table, and the file loads successfully.

But now I want to create the table from a client using mysql jdbc. The error is:

SerDe: myserde.TestDeserializer does not exist.

How can I run it? Thanks.

4 Answers

11
votes

So, there are a few options. In all of them, the jar needs to be present on the cluster where Hive is installed. The JDBC client code, of course, can be run from anywhere inside or outside the cluster.

Option 1: Issue this HQL statement before you run any of your other HQL commands:

ADD JAR hiveserde-1.0.jar

Option 2: Update your hive-site.xml so that the hive.aux.jars.path property is set to the complete path to your jar, hiveserde-1.0.jar.
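
For Option 1 from a JDBC client, a minimal sketch might look like the following. It assumes a Hive JDBC driver is already on the client classpath and that the jar path is valid on the Hive server machine. The class name, the table name test_table, its single column, and the command-line arguments are all hypothetical, chosen only for illustration. Both statements go through the same connection because ADD JAR applies to the current session.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

public class AddJarThenCreateTable {

    // Statements in the order Hive needs to see them: register the SerDe jar
    // first, then reference the SerDe class in the DDL.
    // The table name and column are hypothetical placeholders.
    static List<String> statements(String jarPath) {
        return List.of(
            "ADD JAR " + jarPath,
            "CREATE TABLE IF NOT EXISTS test_table (line STRING) "
                + "ROW FORMAT SERDE 'myserde.TestDeserializer' "
                + "STORED AS SEQUENCEFILE");
    }

    // Runs every statement on one connection so they share one Hive session.
    static void run(Connection conn, String jarPath) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            for (String sql : statements(jarPath)) {
                stmt.execute(sql);
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 2) {
            System.err.println("usage: AddJarThenCreateTable <jdbc-url> <jar-path-on-hive-server>");
            return;
        }
        // args[0] is the JDBC URL of your Hive server (its exact form depends
        // on your Hive version and driver); args[1] is the jar path as seen
        // by the Hive server, not by this client.
        try (Connection conn = DriverManager.getConnection(args[0])) {
            run(conn, args[1]);
        }
    }
}
```

The statements are sent without a trailing semicolon, since each one is passed to the driver as a single statement rather than typed into the CLI.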

1
votes

Go to your hive-env.sh and append to the bottom of the file:

export HIVE_AUX_JARS_PATH=$HIVE_AUX_JARS_PATH:/<path-to-jar>

You can then source this file. Not ideal, but it works.

0
votes

Are you saying that you'd like to create the table through JDBC rather than in the CLI? In that case, you should add the jar to your classpath when you run your JDBC code.

0
votes

Yes, this can be a little confusing. It seems that half the time Hive reads from the cluster and the other half from the local file system (the machine where the Hive server is installed).

To overcome this, simply copy the .jar file to the Hive server machine. You can then reference it in your Hive query, for example:

add jar /tmp/json-serde.jar;

create table tweets (
    name string,
    address1 string,
    address2 string,
    address3 string,
    postcode string
)
...

And then onto the next problem ;)