
I am running an s3distcp job on AWS EMR with Hadoop 2.2.0, and the job keeps failing because one reducer task fails after 3 attempts. I also tried setting both:

mapred.max.reduce.failures.percent
mapreduce.reduce.failures.maxpercent

to 50, both in the Oozie Hadoop action configuration and in mapred-site.xml, but the job still fails.
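For context, the relevant fragment of the Oozie action configuration looks roughly like this (a sketch; surrounding action details are omitted, and both the old and the new property names are set to allow up to 50% of reducers to fail):

```xml
<!-- Sketch of the <configuration> block inside the Oozie action.
     Property names are the actual Hadoop ones; the rest of the
     action definition is omitted for brevity. -->
<configuration>
    <property>
        <name>mapred.max.reduce.failures.percent</name>
        <value>50</value>
    </property>
    <property>
        <name>mapreduce.reduce.failures.maxpercent</name>
        <value>50</value>
    </property>
</configuration>
```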

And here are the logs:

2015-10-02 14:42:16,001 INFO [main] org.apache.hadoop.mapreduce.Job: Task Id : attempt_1443541526464_0115_r_000010_2, Status : FAILED
2015-10-02 14:42:17,005 INFO [main] org.apache.hadoop.mapreduce.Job: map 100% reduce 93%
2015-10-02 14:42:29,048 INFO [main] org.apache.hadoop.mapreduce.Job: map 100% reduce 98%
2015-10-02 15:04:20,369 INFO [main] org.apache.hadoop.mapreduce.Job: map 100% reduce 100%
2015-10-02 15:04:21,378 INFO [main] org.apache.hadoop.mapreduce.Job: Job job_1443541526464_0115 failed with state FAILED due to: Task failed task_1443541526464_0115_r_000010
Job failed as tasks failed. failedMaps:0 failedReduces:1

2015-10-02 15:04:21,451 INFO [main] org.apache.hadoop.mapreduce.Job: Counters: 45
    File System Counters
        FILE: Number of bytes read=280
        FILE: Number of bytes written=10512783
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=32185011
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=170
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=28
    Job Counters
        Failed reduce tasks=4
        Launched map tasks=32
        Launched reduce tasks=18
        Data-local map tasks=15
        Rack-local map tasks=17
        Total time spent by all maps in occupied slots (ms)=2652786
        Total time spent by all reduces in occupied slots (ms)=65506584
    Map-Reduce Framework
        Map input records=156810
        Map output records=156810
        Map output bytes=30892192
        Map output materialized bytes=6583455
        Input split bytes=3904
        Combine input records=0
        Combine output records=0
        Reduce input groups=0
        Reduce shuffle bytes=7168
        Reduce input records=0
        Reduce output records=0
        Spilled Records=156810
        Shuffled Maps =448
        Failed Shuffles=0
        Merged Map outputs=448
        GC time elapsed (ms)=2524
        CPU time spent (ms)=108250
        Physical memory (bytes) snapshot=14838984704
        Virtual memory (bytes) snapshot=106769969152
        Total committed heap usage (bytes)=18048614400
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=32181107
    File Output Format Counters
        Bytes Written=0
2015-10-02 15:04:21,451 INFO [main] com.amazon.external.elasticmapreduce.s3distcp.S3DistCp: Try to recursively delete hdfs:/tmp/218ad028-8035-4f97-b113-3cfea04502fc/tempspace
2015-10-02 15:04:21,515 INFO [main] org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
2015-10-02 15:04:21,516 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.deflate]
2015-10-02 15:04:21,554 INFO [main] org.apache.hadoop.mapred.Task: Task:attempt_1443541526464_0114_m_000000_0 is done. And is in the process of committing
2015-10-02 15:04:21,570 INFO [main] org.apache.hadoop.mapred.Task: Task attempt_1443541526464_0114_m_000000_0 is allowed to commit now
2015-10-02 15:04:21,584 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved output of task 'attempt_1443541526464_0114_m_000000_0' to hdfs://rnd2-emr-head.ec2.int$
2015-10-02 15:04:21,598 INFO [main] org.apache.hadoop.mapred.Task: Task 'attempt_1443541526464_0114_m_000000_0' done.
2015-10-02 15:04:21,616 INFO [Thread-6] amazon.emr.metrics.MetricsSaver: Inside MetricsSaver Shutdown Hook

Any suggestions would be much appreciated.


1 Answer


Can you try cleaning the HDFS /tmp directory? Take a backup of the directory first, since other applications also use /tmp; if you run into any issues afterwards, you can restore it from the backup.
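The backup-and-clean step could look something like this (a sketch; the backup path is illustrative, and you should run it while no jobs are writing to /tmp):

```shell
# Back up the HDFS /tmp directory before cleaning it
# (the /tmp_backup location is just an example).
hadoop fs -mkdir -p /tmp_backup
hadoop fs -cp 'hdfs:///tmp/*' hdfs:///tmp_backup/

# Clear the temporary data; -skipTrash frees the space immediately.
hadoop fs -rm -r -skipTrash 'hdfs:///tmp/*'

# If another application misbehaves afterwards, restore from the backup:
# hadoop fs -cp 'hdfs:///tmp_backup/*' hdfs:///tmp/
```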