We have a standalone Spark cluster. When there is not enough memory to cache an RDD, Spark spills the data to disk. Where exactly is the data spilled when there is no HDFS? The local disk of each worker node?
As far as I know, all spilled data is written to the local scratch directories defined by spark.local.dir (which defaults to /tmp), independent of any HDFS access. So yes: it goes to the local disk of each worker node. In standalone mode, the SPARK_LOCAL_DIRS environment variable on a worker overrides this setting.
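For example, you can point the scratch space at dedicated disks in spark-defaults.conf on each worker. A comma-separated list spreads spill I/O across multiple disks (the paths below are just illustrative):

```
# spark-defaults.conf (example paths, adjust to your mounts)
spark.local.dir /mnt/disk1/spark-tmp,/mnt/disk2/spark-tmp
```

Make sure the Spark user has write access to these directories and that they have enough free space for shuffle and spill files.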