
I work on a Spark 2.1 application that also uses SparkSQL and saves data with dataframe.write.saveAsTable(tbl). My understanding is that an in-memory Derby DB is used for the Hive metastore (right?). This means that a table I create in the first execution is not available in any subsequent executions. In many cases that might be the intended behaviour, but I would like to persist the metastore across executions (since this is also the behaviour I have in my production system).

So, a simple question: How can I change the configuration to persist the metastore on disk?

One remark: I am not starting the Spark job with spark-shell or spark-submit, but as a standalone Scala application.
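
For reference, a minimal sketch of what such a standalone application might look like (the object name, data, and table name are placeholders, not my actual code):

```scala
import org.apache.spark.sql.SparkSession

object SaveTableApp {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SaveTableApp")
      .master("local[*]")
      .getOrCreate()

    import spark.implicits._

    // Example data; in the real application it comes from elsewhere.
    val df = Seq((1, "a"), (2, "b")).toDF("id", "value")

    // Saves the data as a managed table registered in the metastore.
    df.write.saveAsTable("tbl")

    spark.stop()
  }
}
```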


1 Answer


It is already persisted on disk. The embedded Derby metastore is written to a metastore_db directory (and table data to a spark-warehouse directory) in the application's working directory. As long as every execution uses the same working directory, or points to the same locations via explicit metastore configuration, permanent tables created in one session remain visible in subsequent sessions.
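
As a sketch of the "explicit configuration" option: one way is to pin the warehouse and Derby locations when building the session, so they no longer depend on where the application happens to be launched from. This assumes the spark-hive dependency is on the classpath (for enableHiveSupport) and the paths below are placeholders:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("SaveTableApp")
  .master("local[*]")
  // Where managed table data is stored.
  .config("spark.sql.warehouse.dir", "/path/to/spark-warehouse")
  // Where the embedded Derby metastore database lives (placeholder path).
  .config("javax.jdo.option.ConnectionURL",
    "jdbc:derby:;databaseName=/path/to/metastore_db;create=true")
  .enableHiveSupport()
  .getOrCreate()

// A table saved in an earlier run should now be visible here, e.g.:
// spark.table("tbl").show()
```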