0
votes

For Spark on YARN and in Standalone mode, resource allocation is requested for example via spark-submit, which means, for example, that containers are strictly limited to the memory requested. This is done using CGroups, a Linux kernel feature. But how is the resource isolation actually done for YARN and Standalone? Here I am especially curious how the in-memory computation is secured when running your VM on a server shared with users you might not know, for example when running on the machines of a cloud provider.
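For context, a sketch of how such a resource request looks with spark-submit (the flags are standard Spark CLI options; the specific values and the jar name are illustrative only):

```shell
# Build a spark-submit command that requests explicit resources on YARN.
# 2 executors of 1g each are illustrative values, not a recommendation;
# my_app.jar is a hypothetical application jar.
SUBMIT_CMD="spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 2 \
  --executor-memory 1g \
  --executor-cores 1 \
  my_app.jar"

# YARN rounds each per-container request up to a multiple of
# yarn.scheduler.minimum-allocation-mb and adds memory overhead on top
# of the executor heap.
echo "$SUBMIT_CMD"
```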

For example, when using YARN, how does YARN make sure that containers are not affected by other containers on the machine? When a container is assigned 1 GB of memory, how is it guaranteed that this GB is not used by others on the machine? And how does Spark make sure there is no memory leak, so that a malicious user running another application, or even another VM, cannot obtain data from that memory?

1
Huh... are you implying that a Linux process can access the memory allocated to any other Linux process, and sneak into someone else's data?!? - Samson Scharfrichter
Or do you imply that the RAM is not cleaned up when a process ends, and another process can allocate some memory and inspect the bits to guess how objects were serialized there? - Samson Scharfrichter
The reason is the DRAM vulnerability published in 2014, link. The idea of this attack is that you can cause bit flips in adjacent rows and thus perhaps corrupt someone else's data. Since YARN is supposed to ensure that containers run isolated, I am curious how that is achieved and how something like this is prevented. More accessible information can be found at googleprojectzero.blogspot.de/2015/03/… . So I am wondering whether, by exploiting this, you could access the memory of Spark or YARN. - Paul Velthuis
You are referring to a hardware vulnerability -- the OS cannot protect you from that, because it would have to know the exact 3D topology of its RAM chips and over-allocate blank pages "between" the pages actually allocated to processes. Which is insane, because memory gets very fragmented in practice, so the unused pages would rapidly eat up 50% of the RAM or more; and each chip has different characteristics. - Samson Scharfrichter
Java does not control how/when its memory pages get moved or swapped to/from disk, and neither does YARN. So IMHO your question does not make sense. If you are afraid that some RAM chips have specific vulnerabilities that can affect unknown servers in unknown clouds with non-deterministic effects, then I have a solution for you: join the Amish, and live away from all computers. - Samson Scharfrichter

1 Answer

0
votes

Edited:
When Spark (or any other application) manages resources via YARN, it becomes YARN's responsibility to enforce that resource allocation. Now:

how does YARN make sure that the containers aren't affected by other containers on the node?

Check here for a detailed explanation of YARN memory allocation.

YARN containers are JVM processes, so when launching a container the NodeManager specifies JVM options to restrict the VM memory, and there is a component called ContainersMonitor in the NodeManager which monitors the total memory usage of the process and sends a kill signal if the process tries to consume more than its allocation.
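A simplified sketch of that mechanism, assuming illustrative numbers (the real launch script is generated by the NodeManager, and the real ContainersMonitor reads process memory from /proc; this just models the two limits side by side):

```shell
# Two separate limits apply to a YARN container:
#  1. the JVM heap cap (-Xmx), set in the generated launch command, and
#  2. the container limit enforced by ContainersMonitor on the whole
#     process tree's memory usage.
HEAP_LIMIT_MB=896          # heap portion of a 1024 MB container (illustrative)
CONTAINER_LIMIT_MB=1024    # what ContainersMonitor enforces

# Shape of the launch command the NM generates (not executed here):
JAVA_CMD="java -Xmx${HEAP_LIMIT_MB}m org.apache.spark.executor.CoarseGrainedExecutorBackend"

# ContainersMonitor-style check: if measured memory exceeds the container
# limit, the NM kills the container's process tree.
check_and_kill() {
  rss_mb=$1
  if [ "$rss_mb" -gt "$CONTAINER_LIMIT_MB" ]; then
    echo "kill"
  else
    echo "ok"
  fi
}

check_and_kill 2000   # prints "kill"
check_and_kill 500    # prints "ok"
```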

Is the NM's ContainersMonitor using CGroups for monitoring CPU and memory?

As per the official documentation, Using CGroups with YARN: CGroups is a mechanism for aggregating/partitioning sets of tasks, and all their future children, into hierarchical groups with specialized behaviour. CGroups is a Linux kernel feature and was merged into kernel version 2.6.24. From a YARN perspective, this allows containers to be limited in their resource usage. A good example of this is CPU usage: without CGroups, it becomes hard to limit container CPU usage. Currently, CGroups is only used for limiting CPU usage.
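Roughly, turning CGroups-based CPU limiting on involves pointing the NodeManager at the LinuxContainerExecutor and the CGroups resource handler in yarn-site.xml. The property names below come from the YARN CGroups documentation; this block just writes an illustrative fragment, not a complete configuration:

```shell
# Write an illustrative yarn-site.xml fragment enabling CGroups-based
# CPU isolation via the LinuxContainerExecutor.
cat > yarn-site-fragment.xml <<'EOF'
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
</property>
EOF

grep -c '<property>' yarn-site-fragment.xml   # prints 2
```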

For memory, it is coming in Hadoop 3. Refer to the JIRA here.

How is it made sure that the memory is only used for this application?

For the memory allocated to the JVM process, the JVM itself throws an out-of-memory error for the heap, and on top of that the NM's ContainersMonitor does the monitoring and killing against the total container limit.

and can't be used by another application?

The admin ensures it. Ha ha ha, NO ONE is allowed to log in to the worker nodes apart from a few admins, in our case.

Now coming to the planning: suppose you have 64 GB RAM in each worker/datanode machine, and no one is allowed to log in to run any custom code, so only the required services (Linux and YARN services) are running, which take at most 10 GB. So you decide to allocate 48 GB of the remainder to YARN.

Now, while launching containers, YARN will tell the NM to allocate at most 4 GB per container (out of which a percentage will be allotted as the actual JVM heap, as per the settings), which ensures at least 12 happy containers.

And if all jobs request 1 GB per container, YARN will be able to fit 48 containers. (Thanks @Samson Scharfrichter)
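The sizing arithmetic above, using the answer's own numbers, can be checked with a tiny calculation:

```shell
# Container-count arithmetic from the example: 48 GB handed to YARN,
# containers of 4 GB or 1 GB each.
TOTAL_MB=$((48 * 1024))
PER_CONTAINER_4G=$((4 * 1024))
PER_CONTAINER_1G=1024

echo "4 GB containers: $((TOTAL_MB / PER_CONTAINER_4G))"   # prints 12
echo "1 GB containers: $((TOTAL_MB / PER_CONTAINER_1G))"   # prints 48
```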