I'm trying to write the contents of a dataframe to an existing partitioned managed Hive table like so:
outputDF.write.mode("Overwrite").insertInto(targetTable)
The target table is stored as ORC and I want to preserve that format. Using saveAsTable would drop and recreate the table as parquet (see here: What are the differences between saveAsTable and insertInto in different SaveMode(s)?).
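To make the contrast concrete, a short sketch (targetTable as above):

// Drops the existing ORC table and recreates it in Spark's default source
// format (parquet, unless spark.sql.sources.default says otherwise):
outputDF.write.mode("overwrite").saveAsTable(targetTable)

// Writes into the existing table definition, preserving the ORC format:
outputDF.write.mode("overwrite").insertInto(targetTable)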
The trouble is that for some of my tables I need the whole table's data overwritten (similar to a truncate), not just specific partitions.
It is unclear to me whether overwrite is supported in this context and, if it is, what I am doing wrong. My SparkSession sets the following configuration values:
.config("spark.sql.sources.partitionOverwriteMode", "static"/"dynamic")
.config("hive.exec.dynamic.partition", "true")
.config("hive.exec.dynamic.partition.mode", "nonstrict")
Am I missing something?
Also, I suspect this can be achieved through the SQL API, but I am trying to avoid it.
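For reference, that SQL route would look roughly like the following, where part_col is a placeholder for the table's partition column:

outputDF.createOrReplaceTempView("output_view")
spark.sql(s"INSERT OVERWRITE TABLE $targetTable PARTITION (part_col) " +
  "SELECT * FROM output_view")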
SD_