How to spark-submit a HiveContext application written in an IDE?

Question

I am trying to deploy code that uses HiveContext to a Spark cluster:

./spark-submit --class com.dt.sparkSQL.DataFrameToHive --master spark://SparkMaster:7077 /root/Documents/DataFrameToHive.jar

But here is the problem:

17/08/13 10:29:46 INFO hive.metastore: Trying to connect to metastore with URI thrift://SparkMaster:9083
17/08/13 10:29:46 WARN hive.metastore: Failed to connect to the MetaStore Server...
17/08/13 10:29:46 INFO hive.metastore: Waiting 1 seconds before next connection attempt.
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient

When I run spark-shell:

./spark-shell  --master spark://SparkMaster:7077

I can connect to SparkMaster:9083 successfully. Here is my spark/conf/hive-site.xml:

<configuration>
<property>
        <name>hive.metastore.uris</name>
        <value>thrift://SparkMaster:9083</value>
        <description>thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
</configuration>
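
Since more than one hive-site.xml can exist on a machine (for example one under spark/conf and another under Hive's own conf directory), it can help to confirm which metastore URI a given copy declares. The following is a minimal sketch using only the JDK's XML parser; the object name `HiveSiteCheck` and the method `propertyValue` are hypothetical helpers introduced for illustration, not part of Spark or Hive.

```scala
import java.io.ByteArrayInputStream
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Element

// Hypothetical helper: not part of Spark or Hive.
object HiveSiteCheck {
  // Returns the <value> of the <property> whose <name> matches, if present.
  def propertyValue(xml: String, name: String): Option[String] = {
    val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    val doc = builder.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")))
    val props = doc.getElementsByTagName("property")
    (0 until props.getLength).iterator
      .map(i => props.item(i).asInstanceOf[Element])
      .find(p => textOf(p, "name").contains(name))
      .flatMap(p => textOf(p, "value"))
  }

  // Trimmed text of the first child element with the given tag, if any.
  private def textOf(parent: Element, tag: String): Option[String] = {
    val nodes = parent.getElementsByTagName(tag)
    if (nodes.getLength > 0) Some(nodes.item(0).getTextContent.trim) else None
  }

  def main(args: Array[String]): Unit = {
    // args(0) is the path to the hive-site.xml to inspect.
    val source = scala.io.Source.fromFile(args(0), "UTF-8")
    val xml = try source.mkString finally source.close()
    println(propertyValue(xml, "hive.metastore.uris").getOrElse("hive.metastore.uris not set"))
  }
}
```

Running it against each copy of hive-site.xml shows whether they agree on `hive.metastore.uris`.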

My question is: why does it try to connect to SparkMaster:9083 when I run spark-submit, and what is wrong with SparkMaster:9083? Here is the code written in the IDE:

package com.dt.sparkSQL

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.hive.HiveContext
object DataFrameToHive {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
    conf.setAppName("DataFrameToHive").setMaster("spark://SparkMaster:7077")
    val sc = new SparkContext(conf)
    val hiveContext = new HiveContext(sc)
    hiveContext.sql("use userdb")
    hiveContext.sql("DROP TABLE IF EXISTS people")
    hiveContext.sql("CREATE TABLE IF NOT EXISTS people(name STRING, age INT)ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t' LINES TERMINATED BY '\\n'")
    hiveContext.sql("LOAD DATA LOCAL INPATH '/root/Documents/people.txt' INTO TABLE people")
    hiveContext.sql("use userdb")
    hiveContext.sql("DROP TABLE IF EXISTS peopleScores")
    hiveContext.sql("CREATE TABLE IF NOT EXISTS peopleScores(name STRING, score INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t' LINES TERMINATED BY '\\n'")
    hiveContext.sql("LOAD DATA LOCAL INPATH '/root/Documents/peopleScore.txt' INTO TABLE peopleScores")
    val resultDF = hiveContext.sql("select pi.name,pi.age,ps.score "
      +" from people pi join peopleScores ps on pi.name=ps.name"
      +" where ps.score>90")
    hiveContext.sql("drop table if exists peopleResult")
    resultDF.saveAsTable("peopleResult")
    val dataframeHive = hiveContext.table("peopleResult")
    dataframeHive.show()
  }
}

Jason Shu · Accepted Answer · 2017-08-14T01:57:41

I have solved this problem. Deploying an application that uses HiveContext is somewhat different from deploying a regular jar. In my case, passing hive-site.xml to spark-submit with --files did it:

./spark-submit --class com.dt.sparkSQL.DataFrameToHive --files /usr/local/hive/apache-hive-1.2.1-bin/conf/hive-site.xml --master spark://SparkMaster:7077 /root/Documents/DataFrameToHive.jar
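
For anyone debugging a similar failure, two things are worth checking from the JVM that runs the driver: whether a hive-site.xml is visible as a classpath resource (which, as far as I know, is how Hive's client configuration is normally discovered), and whether the metastore host and port accept TCP connections at all. The sketch below does both with the standard library only. The object `MetastoreDiagnostics` and its methods are hypothetical names for illustration, and the default URI is simply the one from the question.

```scala
import java.net.{InetSocketAddress, Socket, URI}
import scala.util.Try

// Hypothetical diagnostic helper: not part of Spark or Hive.
object MetastoreDiagnostics {
  // Location of hive-site.xml on the current classpath, if any.
  def hiveSiteOnClasspath(): Option[String] =
    Option(Thread.currentThread().getContextClassLoader.getResource("hive-site.xml"))
      .map(_.toString)

  // True if a TCP connection to the URI's host and port succeeds within timeoutMs.
  // This only tests reachability; it does not speak the Thrift protocol.
  def reachable(thriftUri: String, timeoutMs: Int = 2000): Boolean = {
    val uri = new URI(thriftUri)
    Try {
      val socket = new Socket()
      try socket.connect(new InetSocketAddress(uri.getHost, uri.getPort), timeoutMs)
      finally socket.close()
    }.isSuccess
  }

  def main(args: Array[String]): Unit = {
    println(s"hive-site.xml on classpath: ${hiveSiteOnClasspath().getOrElse("not found")}")
    // Assumption: fall back to the metastore URI from the question.
    val uri = args.headOption.getOrElse("thrift://SparkMaster:9083")
    println(s"$uri reachable: ${reachable(uri)}")
  }
}
```

If the port is reachable but no hive-site.xml is found, the problem is more likely configuration visibility than the metastore service itself; if the port is not reachable, check that the metastore service is actually running on that host.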
