
I'm using the latest HDP Sandbox (2.4.0.0-169). I have written the following code in spark-shell (Spark version 1.6.0):

val orcData = sqlContext.sql("select code from sample_07")
val paymentDataCache = orcData.cache()
paymentDataCache.registerTempTable("paymentDataCache")

I followed the commands below to start the Thrift server and connect with Beeline:

1) export SPARK_HOME=/usr/hdp/2.4.0.0-169/spark/
2) sudo ./sbin/start-thriftserver.sh --master yarn-client --executor-memory 512m --hiveconf hive.server2.thrift.port=10015
3) ./bin/beeline
4) !connect jdbc:hive2://localhost:10015

Now if I execute show tables, I expect to see the paymentDataCache temporary table, but it does not appear. Please find the attached screenshot.

I also tried starting the Thrift server with single-session mode enabled:

sudo ./sbin/start-thriftserver.sh --master yarn-client --executor-memory 512m --hiveconf hive.server2.thrift.port=10015 --conf spark.sql.hive.thriftServer.singleSession=true

but no luck.

We tried the same process on a 9-node HDP cluster (2.3.2.0-2950 with Spark 1.4.1), but we still do not see the temporary tables in Spark Beeline.

1 Answer


When you register a temporary table, it exists only in the Spark context in which it was created. When you start a standalone Thrift server, it runs in a different Spark context from your spark-shell, so it cannot see the temporary table.

If you want to run a test, you can put in your spark-shell the following line of code:

org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.startWithContext(sqlContext)

This starts a new Thrift server inside the Spark context you pass in, so the new Thrift server can see the temporary tables registered in that context.
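
Putting it together, a complete spark-shell session might look like the sketch below (for Spark 1.6, where spark-shell's sqlContext is a HiveContext when Hive support is built in; the port and table names follow the question above):

import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

// register the temporary table in this shell's context
val orcData = sqlContext.sql("select code from sample_07")
orcData.cache().registerTempTable("paymentDataCache")

// start the Thrift server inside the same context,
// listening on the port used in the question
sqlContext.setConf("hive.server2.thrift.port", "10015")
HiveThriftServer2.startWithContext(sqlContext)

A Beeline session connected to jdbc:hive2://localhost:10015 should then list paymentDataCache in the output of show tables.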