I'm running Spark jobs from crontab on a standalone cluster (launched with spark-ec2 1.5.1), and the root disk on my worker nodes is filling up with the app directories that each job creates. Eventually the jobs fail with:

java.io.IOException: Failed to create directory /root/spark/work/app-<app#>

I've looked at http://spark.apache.org/docs/latest/spark-standalone.html and changed my spark-env.sh (located in spark/conf on the master and worker nodes) to include the following:

SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true -Dspark.worker.cleanup.appDataTtl=3600"

Am I doing something wrong? I've added the line to the end of each spark-env.sh file on the master and both workers.
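
One way to check whether the setting actually reached a worker is to look at the command line of the running Worker JVM, since, as far as I know, spark-env.sh is only read when the daemon starts. The following is a minimal sketch, not part of Spark: `java_system_props` is a hypothetical helper, and the sample command line is made up for illustration. On a real node the string would come from something like `ps -o args= -p <worker pid>`.

```python
import shlex


def java_system_props(cmdline):
    """Collect -Dkey=value options from a JVM command line into a dict."""
    props = {}
    for token in shlex.split(cmdline):
        if token.startswith("-D") and len(token) > 2:
            key, _, value = token[2:].partition("=")
            props[key] = value
    return props


if __name__ == "__main__":
    # Hypothetical command line, for illustration only.
    sample = (
        "java -cp /root/spark/conf "
        "-Dspark.worker.cleanup.enabled=true "
        "-Dspark.worker.cleanup.appDataTtl=3600 "
        "org.apache.spark.deploy.worker.Worker spark://master:7077"
    )
    props = java_system_props(sample)
    print(props.get("spark.worker.cleanup.enabled", "<not set>"))
```

If the cleanup properties are missing from the real worker's command line, the worker was probably started before spark-env.sh was edited and needs a restart.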

On a possibly related note, what are these mounts for? I would use them, but I don't want to use them blindly.

Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/xvda1             8256952   8256952         0 100% /
tmpfs                  3816808         0   3816808   0% /dev/shm
/dev/xvdb            433455904   1252884 410184716   1% /mnt
/dev/xvdf            433455904    203080 411234520   1% /mnt2
How long do the applications run? If you've got lots of jobs running at the same time, the cleanup won't happen since it is only for stopped applications. Can you describe your load? - Jacek Laskowski
I have two cron jobs running each hour, 30 minutes apart, and each one takes about 10 minutes on average. Each cron job runs two spark-submit jobs in serial, so that's 4 jobs an hour at roughly 5 minutes each, with no overlap. I don't think the jobs are blocking the cleanup; I think the cleanup is never getting triggered. - jackar

1 Answer

This seems to be a 1.5.1 issue. I'm no longer using the spark-ec2 script to spin up the cluster. I ended up creating a cron job to clear out the work directory, as mentioned in my comment.
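
For anyone wanting to do the same, here is a rough sketch of what such a cleanup job could look like. It is not the exact script I used. The function names are made up for illustration, the work directory is taken from the error message in the question, and the one-hour TTL just mirrors the appDataTtl value above. It defaults to a dry run, so nothing is deleted until you flip the flag.

```python
import shutil
import time
from pathlib import Path


def newest_mtime(path):
    """Most recent modification time of a directory or anything inside it."""
    times = [path.lstat().st_mtime]
    times += [p.lstat().st_mtime for p in path.rglob("*")]
    return max(times)


def stale_app_dirs(work_dir, ttl_seconds, now=None):
    """Return app-* directories with no modification in the last ttl_seconds."""
    now = time.time() if now is None else now
    root = Path(work_dir)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.glob("app-*")
        if p.is_dir() and now - newest_mtime(p) > ttl_seconds
    )


def clean_work_dir(work_dir, ttl_seconds, dry_run=True, now=None):
    """Delete stale app-* directories (or only list them when dry_run is True)."""
    removed = []
    for path in stale_app_dirs(work_dir, ttl_seconds, now):
        if not dry_run:
            shutil.rmtree(path, ignore_errors=True)
        removed.append(path)
    return removed


if __name__ == "__main__":
    # Assumed location and TTL; adjust both for your cluster.
    for p in clean_work_dir("/root/spark/work", ttl_seconds=3600, dry_run=True):
        print("would remove", p)
```

Run it from crontab on each worker. Keep the TTL comfortably longer than your longest job so the directory of a running application is never removed.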