I am facing a problem: I can't access Hive tables from Spark when the job is launched with spark-submit, while the same code works from the pyspark shell. Here is the code:
from pyspark.sql import SparkSession

# Build a session with Hive support enabled, so spark.sql()
# should see the tables registered in the Hive metastore.
spark = SparkSession \
    .builder \
    .appName("Python Spark SQL Hive integration example") \
    .enableHiveSupport() \
    .getOrCreate()

spark.sql("SHOW TABLES").show()
Here is the result from the pyspark shell:
+--------+-------------+-----------+
|database| tableName|isTemporary|
+--------+-------------+-----------+
| default| table1| false|
| default| table2| false|
+--------+-------------+-----------+
Here is the result with spark-submit:
+--------+---------+-----------+
|database|tableName|isTemporary|
+--------+---------+-----------+
+--------+---------+-----------+
I tried adding the Spark conf directory to the classpath and passing hive-site.xml with --files, and I also tried HiveContext instead of SparkSession: same results. I tried the same code in Scala: same results.
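Concretely, the submit command looked roughly like this (the master and the file paths are placeholders for my setup, not exact values):

spark-submit \
--master yarn \
--files /path/to/hive-site.xml \
my_script.py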
EDIT: I am not connecting to a remote Hive server; Hive runs on the same machine as Spark.
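One more check that can be run under both launch modes, to compare what each one actually sees (a sketch using the standard Spark catalog API and the spark.sql.warehouse.dir config):

# Compare across launch modes: the warehouse location and the visible databases.
print(spark.conf.get("spark.sql.warehouse.dir"))
print(spark.catalog.listDatabases())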