0 votes

The Spark job, running in YARN mode, shows a few tasks failed with the following reason:

ExecutorLostFailure (executor 36 exited caused by one of the running tasks) Reason: Container marked as failed: container_xxxxxxxxxx_yyyy_01_000054 on host: ip-xxx-yy-zzz-zz. Exit status: -100. Diagnostics: Container released on a *lost* node

Any idea why this is happening?


3 Answers

5 votes

There are two main reasons.

  1. It may be because the memoryOverhead needed by the YARN container is not enough; the solution is to increase spark.executor.memoryOverhead (see the sketch after this list).
  2. It may be because the slave node's disk lacks space to write the tmp data required by Spark. Check your YARN usercache dir (on EMR, it is located at /mnt/yarn/usercache/),
    or run df -h to check your remaining disk space.
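A minimal PySpark sketch of raising that setting, assuming Spark 2.3+ where the key is spark.executor.memoryOverhead (older releases used spark.yarn.executor.memoryOverhead); the application name and the 2g value are illustrative assumptions, not values from the answer:

    from pyspark.sql import SparkSession

    # Give each executor's YARN container more off-heap headroom.
    # "2g" is only an illustrative starting point; tune it for your job.
    spark = (
        SparkSession.builder
        .appName("overhead-example")                    # hypothetical app name
        .config("spark.executor.memoryOverhead", "2g")
        .getOrCreate()
    )

The same value can also be supplied at launch time, e.g. spark-submit --conf spark.executor.memoryOverhead=2g.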
0 votes

Containers killed by the framework, either because they were released by the application or because they were 'lost' due to node failures etc., have a special exit code of -100. The node failure could be caused by the node not having enough disk space or executor memory.
