
I have created a SparkContext object and tried retrieving text from a text file on a Hadoop server (not on my local machine), and I was able to retrieve it.

When I try to retrieve a Hive table (which is on a standalone machine, a cluster), I am unable to do so, and when I create a Hive table it gets created locally in metastore_db:

objHiveContext.sql("create table yahoo_orc_table (date STRING, open_price FLOAT, high_price FLOAT, low_price FLOAT, close_price FLOAT, volume INT, adj_price FLOAT) stored as orc")

I tried setting the metastore warehouse directory:

objHiveContext.setConf("hive.metastore.warehouse.dir", "hdfs://ServerIP:HiveportNum/apps/hive/warehouse")

and also:

objHiveContext.hql("SET hive.metastore.warehouse.dir=hdfs://serverIp:portNumber/apps/hive/warehouse")

I even placed hive-site.xml in the conf folder of the Spark machine.

How can I make my Scala application read hive-site.xml and take the metastore info from it, and where should I place hive-site.xml?

I have placed it in my application, since everywhere it is suggested to add it to the classpath. I added it and can see it just above my pom.xml file, but my Scala app is still in local mode.

Tables (yahoo_orc_table) are created locally in D:\user\hive\warehouse.


2 Answers


The only place where it should be is in the spark conf directory. If you put it there and still things are not working, that means that the problem is somewhere else, maybe in the contents of hive-site.xml.
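One way to narrow this down is to check from inside the application whether hive-site.xml is visible at all, and what it says about the metastore. The following is only a rough sketch, not part of Spark: the object name `HiveSiteCheck` and its helper methods are made up for illustration, and it assumes the file uses the usual Hadoop-style `<property><name>/<value>` layout with a `hive.metastore.uris` entry pointing at the remote metastore.

```scala
import java.io.InputStream
import java.net.URL
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Element

// Hypothetical helper for illustration only; not part of Spark or Hive.
object HiveSiteCheck {

  // Looks up a resource on the classpath; None means this JVM cannot see it.
  def findOnClasspath(name: String): Option[URL] =
    Option(Thread.currentThread().getContextClassLoader.getResource(name))

  // Reads the value of one <property> entry from a Hadoop-style XML config.
  def readProperty(in: InputStream, key: String): Option[String] = {
    val doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in)
    val props = doc.getElementsByTagName("property")
    (0 until props.getLength)
      .map(i => props.item(i).asInstanceOf[Element])
      .find(e => childText(e, "name").contains(key))
      .flatMap(e => childText(e, "value"))
  }

  // Trimmed text of the first child element with the given tag, if any.
  private def childText(e: Element, tag: String): Option[String] = {
    val nodes = e.getElementsByTagName(tag)
    if (nodes.getLength == 0) None else Some(nodes.item(0).getTextContent.trim)
  }

  def main(args: Array[String]): Unit =
    findOnClasspath("hive-site.xml") match {
      case None =>
        println("hive-site.xml is not on the classpath")
      case Some(url) =>
        val in = url.openStream()
        try {
          val uris = readProperty(in, "hive.metastore.uris")
          println(s"Found $url, hive.metastore.uris = $uris")
        } finally in.close()
    }
}
```

If this reports that the file is missing, or that `hive.metastore.uris` is not set, that would be consistent with tables ending up in a local metastore_db instead of on the cluster.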


This problem was solved in Spark 2. After placing the hive-site.xml file in the conf folder of the Spark machine, you can use:

import org.apache.spark.sql.SparkSession
val spark = SparkSession
  .builder() // SparkSession is constructed through its builder
  .appName("interfacing spark sql to hive metastore without configuration file")
  .config("hive.metastore.uris", "thrift://host:port") // replace with your hivemetastore service's thrift url
  .enableHiveSupport() // don't forget to enable hive support
  .getOrCreate() // create the session (or reuse an existing one)

spark.sql("create table yahoo_orc_table (date STRING, open_price FLOAT, high_price FLOAT, low_price FLOAT, close_price FLOAT, volume INT, adj_price FLOAT) stored as orc")

This code creates the table "yahoo_orc_table" in Hive on the cluster.