I am running nutch2.3 on the hadoop2.5.2 and hbase 0.98.12 with gora 0.6, when doing the process of nutch generate, hadoop throw an eofexception. any suggestion is welcome.
2015-05-18 15:22:06,578 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1362)) - map 100% reduce 0% 2015-05-18 15:22:13,697 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1362)) - map 100% reduce 50% 2015-05-18 15:22:14,720 INFO [main] mapreduce.Job (Job.java:printTaskEvents(1441)) - Task Id : attempt_1431932258783_0006_r_000001_0, Status : FAILED Error: java.io.EOFException at org.apache.avro.io.BinaryDecoder.ensureBounds(BinaryDecoder.java:473) at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:128) at org.apache.avro.io.BinaryDecoder.readIndex(BinaryDecoder.java:423) at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229) at org.apache.avro.io.parsing.Parser.advance(Parser.java:88) at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206) at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:152) at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:177) at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:148) at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:139) at org.apache.hadoop.io.serializer.avro.AvroSerialization$AvroDeserializer.deserialize(AvroSerialization.java:127) at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:146) at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKey(ReduceContextImpl.java:121) at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.nextKey(WrappedReducer.java:302) at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:170) at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
2015-05-18 15:22:21,901 INFO [main] mapreduce.Job (Job.java:printTaskEvents(1441)) - Task Id : attempt_1431932258783_0006_r_000001_1, Status : FAILED Error: java.io.EOFException at org.apache.avro.io.BinaryDecoder.ensureBounds(BinaryDecoder.java:473) at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:128) at org.apache.avro.io.BinaryDecoder.readIndex(BinaryDecoder.java:423) at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229) at org.apache.avro.io.parsing.Parser.advance(Parser.java:88) at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206) at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:152) at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:177) at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:148) at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:139) at org.apache.hadoop.io.serializer.avro.AvroSerialization$AvroDeserializer.deserialize(AvroSerialization.java:127) at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:146) at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKey(ReduceContextImpl.java:121) at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.nextKey(WrappedReducer.java:302) at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:170) at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
2015-05-18 15:22:28,986 INFO [main] mapreduce.Job (Job.java:printTaskEvents(1441)) - Task Id : attempt_1431932258783_0006_r_000001_2, Status : FAILED Error: java.io.EOFException at org.apache.avro.io.BinaryDecoder.ensureBounds(BinaryDecoder.java:473) at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:128) at org.apache.avro.io.BinaryDecoder.readIndex(BinaryDecoder.java:423) at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229) at org.apache.avro.io.parsing.Parser.advance(Parser.java:88) at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206) at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:152) at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:177) at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:148) at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:139) at org.apache.hadoop.io.serializer.avro.AvroSerialization$AvroDeserializer.deserialize(AvroSerialization.java:127) at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:146) at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKey(ReduceContextImpl.java:121) at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.nextKey(WrappedReducer.java:302) at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:170) at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
2015-05-18 15:22:37,078 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1362)) - map 100% reduce 100% 2015-05-18 15:22:37,109 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1375)) - Job job_1431932258783_0006 failed with state FAILED due to: Task failed task_1431932258783_0006_r_000001 Job failed as tasks failed. failedMaps:0 failedReduces:1
2015-05-18 15:22:37,256 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1380)) - Counters: 50 File System Counters FILE: Number of bytes read=22 FILE: Number of bytes written=232081 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 HDFS: Number of bytes read=612 HDFS: Number of bytes written=0 HDFS: Number of read operations=1 HDFS: Number of large read operations=0 HDFS: Number of write operations=0 Job Counters Failed reduce tasks=4 Launched map tasks=1 Launched reduce tasks=5 Rack-local map tasks=1 Total time spent by all maps in occupied slots (ms)=10399 Total time spent by all reduces in occupied slots (ms)=23225 Total time spent by all map tasks (ms)=10399 Total time spent by all reduce tasks (ms)=23225 Total vcore-seconds taken by all map tasks=10399 Total vcore-seconds taken by all reduce tasks=23225 Total megabyte-seconds taken by all map tasks=10648576 Total megabyte-seconds taken by all reduce tasks=23782400 Map-Reduce Framework Map input records=1 Map output records=1 Map output bytes=32 Map output materialized bytes=62 Input split bytes=612 Combine input records=0 Combine output records=0 Reduce input groups=0 Reduce shuffle bytes=14 Reduce input records=0 Reduce output records=0 Spilled Records=1 Shuffled Maps =1 Failed Shuffles=0 Merged Map outputs=1 GC time elapsed (ms)=175 CPU time spent (ms)=6860 Physical memory (bytes) snapshot=628305920 Virtual memory (bytes) snapshot=3198902272 Total committed heap usage (bytes)=481820672 Shuffle Errors BAD_ID=0 CONNECTION=0 IO_ERROR=0 WRONG_LENGTH=0 WRONG_MAP=0 WRONG_REDUCE=0 File Input Format Counters Bytes Read=0 File Output Format Counters Bytes Written=0 2015-05-18 15:22:37,266 ERROR [main] crawl.GeneratorJob (GeneratorJob.java:run(310)) - GeneratorJob: java.lang.RuntimeException: job failed: name=[t2]generate: 1431933684-12185, jobid=job_1431932258783_0006 at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:54) at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:213) at org.apache.nutch.crawl.GeneratorJob.generate(GeneratorJob.java:241) at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:308) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.nutch.crawl.GeneratorJob.main(GeneratorJob.java:316) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Error running: /usr/pro/nutch2.3/deploy/bin/nutch generate -D mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true -topN 50000 -noNorm -noFilter -adddays 0 -crawlId t2 -batchId 1431933684-12185