Just a quick question: I'm running a Spark 1.6.0 program that loads a Hive table concurrently from multiple jobs. Is issuing an insert statement through hiveContext.sql("insert . . .") the way to go? I want the table to be locked during the write, because from what I've seen in the Spark documentation, table locking and atomicity are not guaranteed when using the DataFrame save operations:
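For context, here is roughly what the insert path would look like on my end (the database, table, and path names below are placeholders, not my real ones):

    // Sketch of the HiveQL insert path (Spark 1.6, Scala); names are placeholders.
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val sc = new SparkContext(new SparkConf().setAppName("ConcurrentHiveLoad"))
    val hiveContext = new HiveContext(sc)

    // Register the incoming batch as a temp table, then insert through HiveQL.
    val batchDF = hiveContext.read.parquet("/path/to/incoming/batch")
    batchDF.registerTempTable("incoming_batch")
    hiveContext.sql("INSERT INTO TABLE target_db.target_table SELECT * FROM incoming_batch")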
"Save operations can optionally take a SaveMode, that specifies how to handle existing data if present. It is important to realize that these save modes do not utilize any locking and are not atomic. Additionally, when performing a Overwrite, the data will be deleted before writing out the new data."
How can I ensure atomicity or locking of a Hive table in Spark whenever reading from or inserting data into that table?
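One idea I am considering is forwarding Hive's own concurrency settings through the HiveContext, but I have found nothing confirming that Spark 1.6 writes ever go through Hive's lock manager, so I don't know whether this actually has any effect:

    // These are standard Hive properties (Hive-side locking normally requires a
    // ZooKeeper-backed lock manager); the quorum hosts below are placeholders.
    hiveContext.setConf("hive.support.concurrency", "true")
    hiveContext.setConf("hive.zookeeper.quorum", "zk-host1:2181,zk-host2:2181")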
Any suggestions would be greatly appreciated. Thank you very much.