
I am using a remote MySQL metastore for Hive. When I run the Hive client it works fine, but when I try to use spark-sql, either via spark-shell or spark-submit, I am not able to connect to Hive and get the following error:

    Caused by: javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.derby.jdbc.EmbeddedDriver

I do not understand why Spark tries to connect to a Derby database when I am using MySQL for the metastore.

I am using Apache Spark version 1.3 and Cloudera CDH 5.4.8.


1 Answer


It seems Spark is falling back to the default Hive settings. Follow these steps:

  • Copy hive-site.xml (or create a soft link to it) into your SPARK_HOME/conf folder.
  • Add the Hive lib path to the classpath in SPARK_HOME/conf/spark-env.sh.
  • Restart the Spark cluster for the changes to take effect.
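The first two steps can be sketched as shell commands. The paths below (`/etc/hive/conf/hive-site.xml`, `/usr/lib/hive/lib`, and the MySQL connector jar location) are typical CDH defaults and are assumptions — adjust them for your installation:

```shell
# Link the Hive client config into Spark's conf dir so spark-sql picks it up.
# /etc/hive/conf/hive-site.xml is the usual CDH location (an assumption here).
ln -s /etc/hive/conf/hive-site.xml "$SPARK_HOME/conf/hive-site.xml"

# Put the Hive libraries and the MySQL JDBC driver on Spark's classpath.
# The jar paths below are typical CDH defaults, not guaranteed on your nodes.
cat >> "$SPARK_HOME/conf/spark-env.sh" <<'EOF'
export SPARK_CLASSPATH="$SPARK_CLASSPATH:/usr/lib/hive/lib/*:/usr/share/java/mysql-connector-java.jar"
EOF
```

Without the MySQL JDBC driver on the classpath, Hive's ObjectStore cannot load `com.mysql.jdbc.Driver` and falls back to the embedded Derby driver, which matches the error in the question.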

Does your hive-site.xml have the location of the MySQL metastore? If not, add the following properties and restart spark-shell:

<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://MYSQL_HOST:3306/hive_{version}</value>
    <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
</property>
<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>XXXXXXXX</value>
    <description>Username to use against metastore database</description>
</property> 
<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>XXXXXXXX</value>
    <description>Password to use against metastore database</description>
</property>
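Once the properties are in place, a quick way to confirm Spark is talking to MySQL rather than Derby is a sketch like the following (assuming `spark-sql` is on your PATH):

```shell
# If the metastore connection works, this lists the databases Hive knows about.
spark-sql -e "SHOW DATABASES;"

# When Spark silently falls back to embedded Derby, it creates these files in
# the current working directory; their absence after the run is a good sign.
ls -d metastore_db derby.log 2>/dev/null
```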