
I am new to Hadoop.

I am trying to set up Giraph to run on hadoop-2.6.5 with YARN.

When I submit the Giraph job, it is submitted successfully but then fails, and I get the following log in the container syslog:

    2018-01-30 12:09:01,190 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1517293264136_0002_000002
    2018-01-30 12:09:01,437 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    2018-01-30 12:09:01,471 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
    2018-01-30 12:09:01,471 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (appAttemptId { application_id { id: 2 cluster_timestamp: 1517293264136 } attemptId: 2 } keyId: -1485907628)
    2018-01-30 12:09:01,583 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Using mapred newApiCommitter.
    2018-01-30 12:09:02,154 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config null
    2018-01-30 12:09:02,207 FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
    java.lang.NoClassDefFoundError: io/netty/buffer/ByteBufAllocator
        at org.apache.giraph.bsp.BspOutputFormat.getOutputCommitter(BspOutputFormat.java:62)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:470)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:452)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1541)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:452)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:371)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1499)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1496)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1429)
    Caused by: java.lang.ClassNotFoundException: io.netty.buffer.ByteBufAllocator
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 13 more
    2018-01-30 12:09:02,209 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 1

The application diagnostics show the following:

    Application application_1517293264136_0002 failed 2 times due to AM Container for appattempt_1517293264136_0002_000002 exited with exitCode: 1
    For more detailed output, check application tracking page:http://172.16.0.218:8088/proxy/application_1517293264136_0002/Then, click on links to logs of each attempt.
    Diagnostics: Exception from container-launch.
    Container id: container_1517293264136_0002_02_000001
    Exit code: 1
    Stack trace: ExitCodeException exitCode=1:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:575)
        at org.apache.hadoop.util.Shell.run(Shell.java:478)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:766)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    Container exited with a non-zero exit code 1
    Failing this attempt. Failing the application.

The missing class is io/netty/buffer/ByteBufAllocator, which is in the netty-all jar: https://mvnrepository.com/artifact/io.netty/netty-all

Following suggestions in other questions, I tried adding the jar to HADOOP_CLASSPATH:

Yogin-Patel:hadoop yoginpatel$ echo $HADOOP_CLASSPATH
/Users/yoginpatel/Downloads/gradle-4.3/caches/modules-2/files-2.1/io.netty/netty-all/4.0.43.Final/9781746a179070e886e1fb4b1971a6bbf02061a4/netty-all-4.0.43.Final.jar
Yogin-Patel:hadoop yoginpatel$ 

It shows up in the output of hadoop classpath as well:

Yogin-Patel:hadoop yoginpatel$ hadoop classpath
/Users/yoginpatel/hadoop/etc/hadoop:/Users/yoginpatel/hadoop/share/hadoop/common/lib/*:/Users/yoginpatel/hadoop/share/hadoop/common/*:/Users/yoginpatel/hadoop/share/hadoop/hdfs:/Users/yoginpatel/hadoop/share/hadoop/hdfs/lib/*:/Users/yoginpatel/hadoop/share/hadoop/hdfs/*:/Users/yoginpatel/hadoop/share/hadoop/yarn/lib/*:/Users/yoginpatel/hadoop/share/hadoop/yarn/*:/Users/yoginpatel/hadoop/share/hadoop/mapreduce/lib/*:/Users/yoginpatel/hadoop/share/hadoop/mapreduce/*:/Users/yoginpatel/Downloads/gradle-4.3/caches/modules-2/files-2.1/io.netty/netty-all/4.0.43.Final/9781746a179070e886e1fb4b1971a6bbf02061a4/netty-all-4.0.43.Final.jar:/contrib/capacity-scheduler/*.jar
Yogin-Patel:hadoop yoginpatel$ 

I am setting this up in a development environment; it is a single-node setup.

I have even tried:

job.addFileToClassPath(new Path("/Users/yoginpatel/Downloads/gradle-4.3/caches/modules-2/files-2.1/io.netty/netty-all/4.0.43.Final/9781746a179070e886e1fb4b1971a6bbf02061a4/netty-all-4.0.43.Final.jar"));

None of these approaches helped. How do I make the necessary jar available to the Hadoop node?

This is the GiraphJob submission code, which submits a MapReduce job to the cluster:

    @Test
    public void testPageRank() throws IOException, ClassNotFoundException, InterruptedException {

        GiraphConfiguration giraphConf = new GiraphConfiguration(getConf());
        giraphConf.setWorkerConfiguration(1,1,100);
        GiraphConstants.SPLIT_MASTER_WORKER.set(giraphConf, false);

        giraphConf.setVertexInputFormatClass(JsonLongDoubleFloatDoubleVertexInputFormat.class);
        GiraphFileInputFormat.setVertexInputPath(giraphConf,
                                                 new Path("/input/tiny-graph.txt"));
        giraphConf.setVertexOutputFormatClass(IdWithValueTextOutputFormat.class);

        giraphConf.setComputationClass(PageRankComputation.class);

        GiraphJob giraphJob = new GiraphJob(giraphConf, "page-rank");
        giraphJob.getInternalJob().addFileToClassPath(new Path("/Users/yoginpatel/Downloads/gradle-4.3/caches/modules-2/files-2.1/io.netty/netty-all/4.0.43.Final/9781746a179070e886e1fb4b1971a6bbf02061a4/netty-all-4.0.43.Final.jar"));

        FileOutputFormat.setOutputPath(giraphJob.getInternalJob(),
                                       new Path("/output/page-rank2"));
        giraphJob.run(true);
    }

    private Configuration getConf() {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://localhost:9000");

        conf.set("yarn.resourcemanager.address", "localhost:8032");

        // The framework is now "yarn"; it should also be set like this in mapred-site.xml
        conf.set("mapreduce.framework.name", "yarn");
        return conf;
    }

1 Answer


I got it working by putting Giraph's jar with dependencies in the Hadoop lib path:

cp giraph-1.3.0-SNAPSHOT-for-hadoop-2.6.5-jar-with-dependencies.jar ~/hadoop/share/hadoop/mapreduce/lib/
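
To confirm that a jar in that directory actually provides the missing class, one option is to scan the directory for jars that contain io/netty/buffer/ByteBufAllocator.class. The following is a minimal sketch using only the JDK. JarClassFinder and its method names are hypothetical helpers, not part of Hadoop or Giraph, and the scan only looks at jars directly inside the given directory (it does not recurse).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.jar.JarFile;
import java.util.stream.Stream;

/** Hypothetical helper (not a Hadoop or Giraph API): finds jars that contain a given class. */
final class JarClassFinder {

    private JarClassFinder() {}

    /** Converts "io.netty.buffer.ByteBufAllocator" to "io/netty/buffer/ByteBufAllocator.class". */
    static String toEntryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns true if the jar has an entry for the given fully qualified class name. */
    static boolean jarContainsClass(Path jar, String className) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.getEntry(toEntryName(className)) != null;
        }
    }

    /** Lists the *.jar files directly inside libDir that contain the class, sorted by path. */
    static List<Path> findJarsWithClass(Path libDir, String className) throws IOException {
        List<Path> matches = new ArrayList<>();
        try (Stream<Path> entries = Files.list(libDir)) {
            for (Path entry : (Iterable<Path>) entries::iterator) {
                boolean isJar = Files.isRegularFile(entry)
                        && entry.getFileName().toString().endsWith(".jar");
                if (isJar && jarContainsClass(entry, className)) {
                    matches.add(entry);
                }
            }
        }
        Collections.sort(matches);
        return matches;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java JarClassFinder <lib-dir> <fully.qualified.ClassName>");
            return;
        }
        // Print every jar in the directory that contains the requested class.
        for (Path jar : findJarsWithClass(Paths.get(args[0]), args[1])) {
            System.out.println(jar);
        }
    }
}
```

Running it against the mapreduce/lib directory above with io.netty.buffer.ByteBufAllocator as the class name should print the copied Giraph jar, assuming that jar bundles Netty. An empty result would suggest the class is still not in that directory.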