
I am using org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil to delete data from an HBase table. I wrote a main class (RollbackHandler) and start the job from there:

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil
import org.apache.hadoop.io.Writable
import org.apache.hadoop.mapreduce.Job
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat

def main(args: Array[String]) {
  val config = HBaseConfiguration.create()
  val job = new Job(config, "RollbackHandler")
  job.setJarByClass(classOf[RollBackMapper])
  // doing some filter-creation related stuff,
  // creating the scan etc.
  // ......
  // .....

  // logger, tableName, scan and exitStatus are defined elsewhere in the class
  TableMapReduceUtil.initTableMapperJob(tableName, scan, classOf[RollBackMapper], null, null, job)
  job.setOutputFormatClass(classOf[NullOutputFormat[_ <: Writable, _ <: Writable]])
  job.setNumReduceTasks(0)

  logger.info("Starting RollbackHandler job for HBASE table: " + tableName)
  val status = job.waitForCompletion(true)
  exitStatus = if (status) 0 else 1
}
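For context, RollBackMapper itself just issues a Delete for every row the scan matches and emits nothing (hence the NullOutputFormat and zero reducers). A minimal sketch of such a mapper, assuming the HBase 1.x client API; the BufferedMutator usage and the "mytable" name are illustrative assumptions, not my exact code:

import org.apache.hadoop.hbase.TableName
import org.apache.hadoop.hbase.client.{BufferedMutator, Connection, ConnectionFactory, Delete, Result}
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableMapper
import org.apache.hadoop.io.NullWritable
import org.apache.hadoop.mapreduce.Mapper

class RollBackMapper extends TableMapper[NullWritable, NullWritable] {
  private var connection: Connection = _
  private var mutator: BufferedMutator = _

  override def setup(context: Mapper[ImmutableBytesWritable, Result, NullWritable, NullWritable]#Context): Unit = {
    // One connection per task attempt; "mytable" is an assumed table name
    connection = ConnectionFactory.createConnection(context.getConfiguration)
    mutator = connection.getBufferedMutator(TableName.valueOf("mytable"))
  }

  override def map(row: ImmutableBytesWritable, value: Result,
                   context: Mapper[ImmutableBytesWritable, Result, NullWritable, NullWritable]#Context): Unit = {
    // Delete every row the scan matched; nothing is written to the job output
    mutator.mutate(new Delete(row.copyBytes()))
  }

  override def cleanup(context: Mapper[ImmutableBytesWritable, Result, NullWritable, NullWritable]#Context): Unit = {
    mutator.close()
    connection.close()
  }
}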

I am now running this as follows:

java -classpath /opt/reflex/opt/tms/java/crux2.0-care1.0-jar-with-dependencies.jar:/opt/reflex/opt/tms/java/care-insta-api.jar:/opt/reflex/opt/tms/java/:/opt/reflex/opt/tms/java/care-acume-war/WEB-INF/lib/ RollbackHandler(fully_qualified_name_of_class)

This runs fine when the MapReduce job is launched in local mode. To run it on YARN, I added the following lines in the main() method:

config.set("mapreduce.framework.name", "yarn")
config.addResource(new Path("/opt/hadoop/conf/hdfs-site.xml"))
config.addResource(new Path("/opt/hadoop/conf/mapred-site.xml"))
config.addResource(new Path("/opt/hadoop/conf/yarn-site.xml"))

When running this, the application launched on YARN but failed with the following error:

Diagnostics:
Application application_1502881193709_0090 failed 2 times due to AM Container for appattempt_1502881193709_0090_000002 exited with exitCode: -1000 For more detailed output, check application tracking page:http://RPM-VIP:8088/cluster/app/application_1502881193709_0090Then, click on links to logs of each attempt. Diagnostics: java.io.IOException: Resource file:/opt/reflex/opt/tms/java/crux2.0-care1.0-jar-with-dependencies.jar changed on src filesystem (expected 1476799531000, was 1476800106000

Failing this attempt. Failing the application.

I thought it was a classpath issue, so I created an archive of all the jars and added the following line in the main method:

job.addArchiveToClassPath(new Path("/opt/reflex/jar_archive.tar.gz"))
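For reference, addArchiveToClassPath can also take an HDFS path, so the archive can be staged on HDFS first and every node then localizes the same copy instead of reading a file: path. A minimal sketch of that variation; the HDFS destination path below is only an assumption:

import org.apache.hadoop.fs.{FileSystem, Path}

// Upload the archive to HDFS so all containers localize one consistent copy.
// "/user/reflex/jar_archive.tar.gz" is a hypothetical destination path.
val fs = FileSystem.get(config)
val archiveOnHdfs = new Path("/user/reflex/jar_archive.tar.gz")
fs.copyFromLocalFile(new Path("/opt/reflex/jar_archive.tar.gz"), archiveOnHdfs)
job.addArchiveToClassPath(archiveOnHdfs)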

But the application still failed with the same error. Can someone help? Your help is highly appreciated!

Thanks, Suresh

Comment: Hadoop version being used is 2.7.1. – suresh

1 Answer


I added all the XML files available in the Hadoop conf directory:

config.addResource(new Path("/opt/hadoop/conf/hdfs-site.xml"))
config.addResource(new Path("/opt/hadoop/conf/mapred-site.xml"))
config.addResource(new Path("/opt/hadoop/conf/core-site.xml"))
config.addResource(new Path("/opt/hadoop/conf/yarn-site.xml"))
config.addResource(new Path("/opt/hadoop/conf/capacity-scheduler.xml"))
config.addResource(new Path("/opt/hadoop/conf/hadoop-policy.xml"))

I also copied hbase-site.xml to the Hadoop classpath and restarted YARN, and added hbase-site.xml to the config as follows:

config.addResource(new Path("/opt/hadoop/conf/hbase-site.xml"))

I added the .properties files to the job object as follows:

job.addFileToClassPath(new Path("/opt/hadoop/conf/hadoop-metrics2.properties"))
job.addFileToClassPath(new Path("/opt/hadoop/conf/hadoop-metrics.properties"))
job.addFileToClassPath(new Path("/opt/hadoop/conf/httpfs-log4j.properties"))
job.addFileToClassPath(new Path("/opt/hadoop/conf/log4j.properties"))

Note that these paths are read from HDFS, so make sure the "/opt/hadoop/conf" used above exists as an HDFS path. I copied /opt/hadoop/conf from the local file system to HDFS, and the job ran successfully on YARN after that.
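For completeness, that copy from the local file system to HDFS can also be done programmatically before the addFileToClassPath calls; a minimal sketch, assuming the same /opt/hadoop/conf path is wanted on HDFS:

import org.apache.hadoop.fs.{FileSystem, Path}

// Mirror the local conf directory onto HDFS so the classpath entries resolve there
val fs = FileSystem.get(config)
fs.copyFromLocalFile(new Path("/opt/hadoop/conf"), new Path("/opt/hadoop/conf"))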