2 votes

I am trying to push incremental data into a base Hive table. To do that, I prepared the data in my staging database and stored it in a table with the exact DDL of my base table. To move the data from staging to base, I am trying an "EXCHANGE PARTITION" on the Hive table from spark.sql, as below:

ALTER TABLE base.basetable DROP PARTITION (vehicle='BIKE');
ALTER TABLE base.basetable EXCHANGE PARTITION (vehicle='BIKE') WITH TABLE staging.stagingtable;

But I see the exception:

org.apache.spark.sql.catalyst.parser.ParseException:
Operation not allowed: ALTER TABLE EXCHANGE PARTITION(line 1, pos 0)
== SQL ==
ALTER TABLE base.basetable EXCHANGE PARTITION (vehicle='BIKE') WITH TABLE staging.stagingtable
^^^

This suggests that Spark SQL doesn't support "EXCHANGE PARTITION", even though I have set enableHiveSupport() in my program:

val spark = SparkSession.builder()
  .config(conf)
  .master("yarn")
  .enableHiveSupport()
  .config("hive.exec.dynamic.partition", "true")
  .config("hive.exec.dynamic.partition.mode", "nonstrict")
  .getOrCreate()

Shouldn't enableHiveSupport() give me full support and access for Hive querying?

As a workaround, I can do:

INSERT OVERWRITE TABLE base.basetable PARTITION (vehicle) SELECT * FROM staging.stagingtable WHERE vehicle='BIKE'
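
For reference, a minimal sketch of running that workaround from the session built above (the dynamic-partition settings are already passed in the builder):

// Run the INSERT OVERWRITE workaround through spark.sql.
// Dynamic partitioning must be enabled, as done in the builder above.
spark.sql(
  """INSERT OVERWRITE TABLE base.basetable PARTITION (vehicle)
    |SELECT * FROM staging.stagingtable
    |WHERE vehicle = 'BIKE'""".stripMargin)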

Some of our tables hold huge amounts of data, so rewriting a partition with INSERT OVERWRITE is expensive. If "EXCHANGE PARTITION" doesn't work in Spark SQL, could anyone let me know how I can exchange the partition in this case?

"how can I exchange the partition in this case" - how about executing the query through Hive instead of Spark (via beeline, for example)? - shay__
My entire program is written in Spark. Any suggestions in this case? - Sidhartha
That should not be a problem; you can execute a Hive command from Scala (over JDBC or as a plain shell command). As said in the answer below, you cannot do this with Spark. - shay__
For example: import sys.process._; Seq("hive", "-e", "alter table...") !! - shay__
@shay__ How about I write a plain old Scala JDBC class that connects to Hive using its connection properties, then drops and exchanges the partition? Is that what you are suggesting? - Sidhartha
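
A minimal sketch of the JDBC route discussed in these comments, assuming a HiveServer2 endpoint and the Hive JDBC driver on the classpath (the URL, user, and password below are placeholders):

import java.sql.DriverManager

// Open a plain JDBC connection to HiveServer2 and run the DDL there,
// bypassing Spark's SQL parser entirely.
val conn = DriverManager.getConnection(
  "jdbc:hive2://hiveserver-host:10000/default", "hiveuser", "")
try {
  val stmt = conn.createStatement()
  stmt.execute("ALTER TABLE base.basetable DROP IF EXISTS PARTITION (vehicle='BIKE')")
  stmt.execute(
    "ALTER TABLE base.basetable EXCHANGE PARTITION (vehicle='BIKE') " +
      "WITH TABLE staging.stagingtable")
  stmt.close()
} finally {
  conn.close()
}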

1 Answer

0 votes

The error message tells you the answer: ALTER TABLE ... EXCHANGE PARTITION is not supported by Spark SQL, even with Hive support enabled.
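
One way around this, per the comment thread, is to run the statements through the Hive CLI from your Spark driver. A minimal sketch, assuming the hive binary is available on the driver node's PATH:

import sys.process._

// Run the partition exchange through the Hive CLI instead of spark.sql,
// since Spark's parser rejects EXCHANGE PARTITION.
val exchangeSql =
  """ALTER TABLE base.basetable DROP IF EXISTS PARTITION (vehicle='BIKE');
    |ALTER TABLE base.basetable EXCHANGE PARTITION (vehicle='BIKE')
    |WITH TABLE staging.stagingtable;""".stripMargin

Seq("hive", "-e", exchangeSql).!!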