
I am using Hadoop 3.2.0 and trying to run a simple application in a Docker container. I have made the required configuration changes in both yarn-site.xml and container-executor.cfg to select the LinuxContainerExecutor and the Docker runtime.

I am following the distributed shell example from one of the Hortonworks blog posts: https://hortonworks.com/blog/trying-containerized-applications-apache-hadoop-yarn-3-1/

The problem is that when the application is submitted to YARN, it fails with a directory creation error:

2019-02-14 20:51:16,450 INFO distributedshell.Client: Got application report from ASM for, appId=2, clientToAMToken=null, appDiagnostics=Application application_1550156488785_0002 failed 2 times due to AM Container for appattempt_1550156488785_0002_000002 exited with exitCode: -1000
Failing this attempt.Diagnostics: [2019-02-14 20:51:16.282]Application application_1550156488785_0002 initialization failed (exitCode=20) with output:
main : command provided 0
main : user is myuser
main : requested yarn user is myuser
Failed to create directory /data/yarn/local/nmPrivate/container_1550156488785_0002_02_000001.tokens/usercache/myuser - Not a directory

I have configured yarn.nodemanager.local-dirs in yarn-site.xml, and I can see the same value reflected in the YARN web UI at localhost:8088/conf:

<property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/data/yarn/local</value>
    <final>false</final>
    <source>yarn-site.xml</source>
</property>

I do not understand why it is trying to create the usercache directory inside the nmPrivate directory.
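One way to narrow this down is to check which component of the failing path exists but is not a directory, since "Not a directory" is what the OS reports when a path component is a regular file. The following is only a minimal Python sketch for POSIX systems: the function name first_non_directory is hypothetical, and the path is copied from the error message above.

```python
import os


def first_non_directory(path):
    """Return the first existing prefix of `path` that is not a directory.

    Returns None if every existing prefix is a directory (or a symlink to one).
    """
    path = os.path.abspath(path)
    current = os.sep
    for part in path.strip(os.sep).split(os.sep):
        current = os.path.join(current, part)
        if not os.path.lexists(current):
            # Nothing exists from here on, so no later component can block mkdir.
            return None
        if not os.path.isdir(current):
            # Exists, but is a file (or something else that is not a directory).
            return current
    return None


if __name__ == "__main__":
    # Path taken from the error message in the question.
    failing = (
        "/data/yarn/local/nmPrivate/"
        "container_1550156488785_0002_02_000001.tokens/usercache/myuser"
    )
    print("First non-directory component:", first_non_directory(failing))
```

If this reports the .tokens entry, the directory being created was built on top of the token file path rather than on the configured local directory, which at least tells you where to look next.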

Note: I have verified the permissions for myuser on these directories and have also tried clearing the directories manually, as suggested in a related post, but without success. I do not see any additional information about the container launch failure in any other logs.

How do I debug why the usercache directory is not resolved properly?

Really appreciate any help on this.


1 Answer


I realized that this was caused by the users the services were started as and by the permissions on the directories those services work in.

After making the required changes, I am able to run the examples and other applications without problems.
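For anyone checking the same thing, here is a small Python sketch (POSIX only) that prints the owner, group, and mode of the NodeManager directories so they can be compared with the user the services run as. The helper name describe_path is hypothetical, /data/yarn/local is the value of yarn.nodemanager.local-dirs from the question, and the nmPrivate and usercache subdirectories are assumed to sit directly under it. It only reports what is on disk and does not decide what the correct ownership should be.

```python
import getpass
import grp
import os
import pwd
import stat


def describe_path(path):
    """Return uid, owner name, group name, and octal mode for `path`."""
    st = os.stat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        owner = str(st.st_uid)  # uid has no passwd entry
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)  # gid has no group entry
    return {
        "path": path,
        "uid": st.st_uid,
        "owner": owner,
        "group": group,
        "mode": format(stat.S_IMODE(st.st_mode), "04o"),
    }


if __name__ == "__main__":
    local_dir = "/data/yarn/local"  # yarn.nodemanager.local-dirs from the question
    print("Running as:", getpass.getuser())
    for sub in ("", "nmPrivate", "usercache"):  # assumed subdirectory layout
        path = os.path.join(local_dir, sub) if sub else local_dir
        if os.path.exists(path):
            info = describe_path(path)
            print(f"{info['mode']} {info['owner']}:{info['group']} {info['path']}")
        else:
            print("missing:", path)
```

Running it once as the user that starts the NodeManager and once as the submitting user makes it easier to spot a mismatch between the two.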

Thanks to the Hadoop user community for pointing me in the right direction. The mailing list archive has more details:

http://mail-archives.apache.org/mod_mbox/hadoop-user/201902.mbox/browser