I am using Hadoop 3.2.0 and trying to run a simple application in a docker container and I have made the required configuration changes both in yarn-site.xml and container-executor.cfg to choose LinuxContainerExecutor and docker runtime.
I use the example of distributed shell in one of the hortonworks blog. https://hortonworks.com/blog/trying-containerized-applications-apache-hadoop-yarn-3-1/
The problem I face here is when the application is submitted to YARN it fails with a reason related to directory creation issue with the below error
2019-02-14 20:51:16,450 INFO distributedshell.Client: Got application report from ASM for, appId=2, clientToAMToken=null, appDiagnostics=Application application_1550156488785_0002 failed 2 times due to AM Container for appattempt_1550156488785_0002_000002 exited with exitCode: -1000 Failing this attempt.Diagnostics: [2019-02-14 20:51:16.282]Application application_1550156488785_0002 initialization failed (exitCode=20) with output: main : command provided 0 main : user is myuser main : requested yarn user is myuser Failed to create directory /data/yarn/local/nmPrivate/container_1550156488785_0002_02_000001.tokens/usercache/myuser - Not a directory
I have configured yarn.nodemanager.local-dirs in yarn-site.xml and I can see the same reflected in YARN web ui localhost:8088/conf
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>/data/yarn/local</value>
<final>false</final>
<source>yarn-site.xml</source>
</property>
I do not understand why is it trying to create usercache dir inside the nmPrivate directory.
Note : I have verified the permissions for myuser to the directories and also have tried clearing the directories manually as suggested in a related post. But no fruit. I do not see any additional information about container launch failure in any other logs.
How do I debug why the usercache dir is not resolved properly??
Really appreciate any help on this.