
I am trying to read a big HBase table (~100 GB in size) in Spark.

Spark version: 1.6

Spark submit parameters:

spark-submit --master yarn-client --num-executors 10 --executor-memory 4G \
             --executor-cores 4 \
             --conf spark.yarn.executor.memoryOverhead=2048

Error: ExecutorLostFailure Reason: Container killed by YARN for exceeding limits. 4.5GB of 3GB physical memory used limits. Consider boosting spark.yarn.executor.memoryOverhead.

I have tried setting spark.yarn.executor.memoryOverhead to 100000, but I am still getting a similar error.

I don't understand why Spark doesn't spill to disk if memory is insufficient, or whether YARN is causing the problem here.


1 Answer


Please share the code you use to read the table, and also your cluster architecture.
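In the meantime, here is a minimal sketch of what a typical Spark 1.6 + HBase read looks like through TableInputFormat, assuming that is the path you are using (the table name "my_table" and the app name are placeholders):

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.spark.{SparkConf, SparkContext}

object HBaseScan {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("hbase-scan"))

    // Point the input format at the table; "my_table" is a placeholder name.
    val hbaseConf = HBaseConfiguration.create()
    hbaseConf.set(TableInputFormat.INPUT_TABLE, "my_table")

    // newAPIHadoopRDD gives one Spark partition per HBase region.
    val hbaseRdd = sc.newAPIHadoopRDD(
      hbaseConf,
      classOf[TableInputFormat],
      classOf[ImmutableBytesWritable],
      classOf[Result])

    println(hbaseRdd.count())
    sc.stop()
  }
}

If your read looks very different from this, please post it, because the fix depends on where the rows are being held in memory.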

Container killed by YARN for exceeding limits. 4.5GB of 3GB physical memory used limits

Try something like:

spark-submit \
  --master yarn-client \
  --num-executors 4 \
  --executor-memory 100G \
  --executor-cores 4 \
  --conf spark.yarn.executor.memoryOverhead=20480

This assumes you have 128 GB of RAM per node: 100 GB of executor memory plus 20 GB of overhead is roughly 120 GB per container, which still fits under 128 GB.

The situation is clear: you are running out of RAM. Try to rewrite your code in a more disk-friendly way.
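For example, a sketch of what "disk friendly" can mean here, assuming the hbaseRdd from the scan above (the partition count is a placeholder, not a tuned value): split the data into more partitions so each task holds less in memory at once, and persist serialized with a disk fallback instead of keeping everything deserialized on the heap.

import org.apache.spark.storage.StorageLevel

// hbaseRdd is the RDD produced by the HBase scan above.
// More partitions means less data per task held in memory at one time.
val repartitioned = hbaseRdd.repartition(400)  // placeholder value; tune for your cluster

// Serialized storage with disk fallback: blocks that don't fit in memory
// spill to local disk instead of blowing up the executor.
repartitioned.persist(StorageLevel.MEMORY_AND_DISK_SER)

Whether this is enough depends on what you do after the scan, which is why the actual code matters.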