0 votes

We have a standalone Spark cluster. If there is not enough memory for RDD storage, Spark spills the data to disk. Where exactly is the data spilled to when there is no HDFS? The local disk of each slave node?

Thanks!


1 Answer

1 vote

As far as I know, all data is spilled to the local directory defined by spark.local.dir on each worker, independent of any HDFS access. If it is not set, Spark falls back to the system temp directory (/tmp by default).
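For reference, a minimal sketch of pointing the spill location at dedicated disks via spark-defaults.conf on each worker (the mount paths below are hypothetical examples; spark.local.dir accepts a comma-separated list so spills can be striped across several disks):

```
# spark-defaults.conf on each worker node
# Paths are example mount points -- substitute your own local disks
spark.local.dir /mnt/disk1/spark-tmp,/mnt/disk2/spark-tmp
```

The same setting can also be passed on the command line with `--conf spark.local.dir=...`, though in a standalone cluster it usually belongs in each worker's config so all executors on that machine use the local disks.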