I am running some MapReduce jobs on an AWS EMR cluster with ~10 nodes (EMR 4.7.11, m3.xlarge).
While the job is running, the worker nodes start to die one by one after ~4 hours. In the logs I found the following error:
"1/3 local-dirs are bad: /mnt/yarn; 1/1 log-dirs are bad: /var/log/hadoop-yarn/containers"
The disks on the worker nodes were at 96% usage when the nodes failed, so I assume the disks filled up to 100% and no more files could be written.
So I attached a 500 GB EBS volume to each instance, but Hadoop only uses /mnt and does not use the additional volume (/mnt2).
How do I configure the AWS EMR cluster to use /mnt2 as well?
I've tried to use a configuration file, but the cluster now fails on bootstrap with the error "On the master instance (i-id), bootstrap action 6 returned a non-zero return code".
Unfortunately, there are no bootstrap action 6 logs in the S3 bucket.
The config file:
[
  {
    "Classification": "core-site",
    "Properties": {
      "hadoop.tmp.dir": "/mnt2/var/lib/hadoop/tmp"
    }
  },
  {
    "Classification": "mapred-site",
    "Properties": {
      "mapred.local.dir": "/mnt2/var/lib/hadoop/mapred"
    }
  }
]
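For comparison, here is the kind of configuration I now suspect is actually needed, targeting YARN's directory settings rather than the MapReduce ones, since the error message complains about the NodeManager's local-dirs and log-dirs. I have not verified this on EMR, and the /mnt2 paths are only my guess, mirroring the defaults from the error message above:

[
  {
    "Classification": "yarn-site",
    "Properties": {
      "yarn.nodemanager.local-dirs": "/mnt/yarn,/mnt2/yarn",
      "yarn.nodemanager.log-dirs": "/var/log/hadoop-yarn/containers,/mnt2/var/log/hadoop-yarn/containers"
    }
  }
]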
Does anyone have a hint as to why the cluster fails on startup? Or is there another way to increase the initial EBS volume size of the m3.xlarge instances?
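Regarding the volume size: as far as I can tell from the EMR documentation, EBS volumes can be requested per instance group when the cluster is created, via an EbsConfiguration block in the instance-groups definition passed to aws emr create-cluster. I have not tested whether this release supports it, and the values below are only illustrative:

[
  {
    "InstanceGroupType": "CORE",
    "InstanceCount": 10,
    "InstanceType": "m3.xlarge",
    "EbsConfiguration": {
      "EbsBlockDeviceConfigs": [
        {
          "VolumeSpecification": {
            "VolumeType": "gp2",
            "SizeInGB": 500
          },
          "VolumesPerInstance": 1
        }
      ]
    }
  }
]

If that works, it would avoid the manual attach step entirely, but I don't know whether EMR then uses the volume for YARN automatically.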
https://forums.aws.amazon.com/thread.jspa?threadID=225588 looks like the same issue, but there is no solution there.