I've been trying to get an accurate picture of how Spark's catalog API stores its metadata.
I have found some resources, but no answer:
- https://jaceklaskowski.gitbooks.io/mastering-spark-sql/spark-sql-Catalog.html
- https://jaceklaskowski.gitbooks.io/mastering-spark-sql/spark-sql-CatalogImpl.html
- https://spark.apache.org/docs/2.3.0/api/java/org/apache/spark/sql/catalog/Catalog.html
I see some tutorials that take the existence of a Hive Metastore for granted.
- Is a Hive Metastore potentially included with the Spark distribution?
- A Spark cluster can be short-lived, but a Hive Metastore would obviously need to be long-lived.
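From what I can tell, the choice of catalog backend is controlled by configuration. A minimal sketch of what I mean (the warehouse path and JDBC URL here are placeholder assumptions, not values from any real deployment):

```
# spark-defaults.conf (sketch)
# "in-memory" = session-scoped catalog, lost when the session ends;
# "hive" = persist metadata in a Hive Metastore
spark.sql.catalogImplementation  hive
spark.sql.warehouse.dir          /path/to/warehouse

# Without further config, Hive support uses an embedded Derby DB in the
# working directory; pointing it at an external DB (assumed host/URL below)
# is what makes the metastore outlive the cluster:
spark.hadoop.javax.jdo.option.ConnectionURL  jdbc:mysql://metastore-host:3306/metastore
```

If that reading is right, the metastore *service* itself is not bundled with Spark, only the client libraries and an embedded Derby fallback, which would explain why it is usually treated as a separate long-lived component.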
Apart from the catalog feature, the bucketing and sorting options when writing out a DataFrame (`bucketBy`/`sortBy`) also seem to depend on Hive... So "everyone" seems to take Hive for granted when talking about Spark's key features for persisting a DataFrame.
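To make the dependency concrete, here is a sketch of the kind of write I mean (table and column names are made up). `bucketBy`/`sortBy` only work with `saveAsTable`, which records the table in the session catalog, so without Hive support the metadata lives only in the in-memory catalog:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder()
  .appName("catalog-demo")
  .enableHiveSupport() // requires the spark-hive module on the classpath
  .getOrCreate()

val df = spark.range(100).withColumn("bucket_key", col("id") % 10)

df.write
  .bucketBy(4, "bucket_key")
  .sortBy("id")
  .saveAsTable("demo_bucketed") // plain .save() rejects bucketBy/sortBy
```

So my understanding is that the bucketing metadata has nowhere durable to go unless a Hive Metastore is configured, which is the part I'd like confirmed.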