RDDs that have been cached with the rdd.cache() method from the Scala shell are stored in memory.
That means they consume part of the RAM available to the Spark process itself.
Given that, if RAM is limited and more and more RDDs are cached, when does Spark automatically free the memory occupied by the RDD cache?
.unpersist(): see stackoverflow.com/questions/25938567/how-to-uncache-rdd – Zouzias
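To make the comment above concrete, here is a minimal sketch of caching an RDD and then releasing it by hand. It assumes it is run inside spark-shell, where `sc` is already defined; the `numbers` RDD and its data are hypothetical and only there for illustration. As far as I understand the Spark programming guide, Spark also drops old cached partitions on its own in least-recently-used order when it needs the space, so calling unpersist() is only required if you want the memory back sooner.

```scala
// Run inside spark-shell, where `sc` (the SparkContext) is predefined.
import org.apache.spark.storage.StorageLevel

// Hypothetical example data; any RDD behaves the same way.
val numbers = sc.parallelize(1 to 1000000)

// cache() is shorthand for persist(StorageLevel.MEMORY_ONLY).
numbers.cache()

// Caching is lazy: partitions are stored only once an action runs.
numbers.count()

// Check whether the RDD is currently marked for in-memory storage.
println(numbers.getStorageLevel.useMemory)

// Release the cached blocks explicitly instead of waiting for eviction.
// blocking = true waits until the blocks have actually been removed.
numbers.unpersist(blocking = true)

// After unpersist() the storage level is reset to NONE.
println(numbers.getStorageLevel == StorageLevel.NONE)
```

The RDD itself stays usable after unpersist(); it is simply recomputed from its lineage the next time an action needs it.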