
I followed the Nutch 2 tutorial and integrated Nutch with HBase successfully. My problem is that when I crawl the URLs by running the following command from the runtime/local/bin directory:

    ./nutch crawl urls/seed.txt abc -depth 50 -topN 50

the following error occurs:

Exception in thread "main" java.lang.RuntimeException: job failed: name=generate: null, jobid=job_local1552667151_0002
        at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:54)
        at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:199)
        at org.apache.nutch.crawl.Crawler.runTool(Crawler.java:68)
        at org.apache.nutch.crawl.Crawler.run(Crawler.java:152)
        at org.apache.nutch.crawl.Crawler.run(Crawler.java:250)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.crawl.Crawler.main(Crawler.java:257)

Please suggest a solution. Any help would be appreciated.

Were you able to solve this issue? I am having the same issue. - Anirudhan J

1 Answer


As most people might suggest, hadoop.log is a good place to look for a better description of the error. In the absence of that information I will hazard the following guesses:

  1. You have set up Nutch on a Windows box.
  2. You are running HBase in Cygwin (attempting to run HBase directly from a Windows command prompt will most likely fail anyway).
  3. You are probably running into a Hadoop file system bug (checking hadoop.log will tell you if this is the case).
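
To check these guesses against hadoop.log, it can help to pull out just the error lines and the stack trace that follows each one. Below is a minimal Python sketch of that idea. The log path runtime/local/logs/hadoop.log is an assumption about a local Nutch runtime, and find_error_blocks is a hypothetical helper name, not part of Nutch or Hadoop.

```python
import re
from pathlib import Path

# Assumed log location for a local Nutch runtime; adjust to your install.
LOG_PATH = Path("runtime/local/logs/hadoop.log")

# Lines that usually mark the start of a failure in a log4j-style log.
ERROR_PATTERN = re.compile(r"\b(ERROR|FATAL)\b|Exception")


def find_error_blocks(lines, context=5):
    """Return (line_number, text) pairs for each matching line plus the
    `context` lines after it, so stack traces stay attached."""
    selected = []
    remaining = 0
    for number, line in enumerate(lines, start=1):
        if ERROR_PATTERN.search(line):
            # Restart the window whenever a new error line is seen.
            remaining = context + 1
        if remaining > 0:
            selected.append((number, line.rstrip("\n")))
            remaining -= 1
    return selected


if __name__ == "__main__":
    if LOG_PATH.exists():
        with LOG_PATH.open(encoding="utf-8", errors="replace") as handle:
            for number, text in find_error_blocks(handle):
                print(f"{number}: {text}")
    else:
        print(f"No log found at {LOG_PATH}; check your Nutch logs directory.")
```

If the output mentions a failure to set permissions on a local path, that would be consistent with the third guess.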

Here's a workaround posted in the Apache JIRA: https://issues.apache.org/jira/browse/HADOOP-7682

Another kind soul put out a patch for it: https://github.com/congainc/patch-hadoop_7682-1.0.x-win

If this is indeed the issue you are running into, use the WinLocalFileSystem class from the patch above and configure Nutch to use it by adding the following to your nutch-site.xml:

<property>
    <name>fs.file.impl</name>
    <value>org.apache.nutch.util.WinLocalFileSystem</value>
    <description>Enables patch for issue HADOOP-7682 on Windows
    </description>
</property>
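
A quick way to confirm that the property actually ended up in the file is to parse it and look the name up. The following Python sketch does that. It assumes the file lives at conf/nutch-site.xml and uses the standard Hadoop layout of property elements under a configuration root; get_property is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumed config location relative to the Nutch runtime; adjust as needed.
SITE_FILE = Path("conf/nutch-site.xml")


def get_property(xml_text, name):
    """Return the value of the named <property>, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            return prop.findtext("value", default="").strip()
    return None


if __name__ == "__main__":
    if SITE_FILE.exists():
        value = get_property(SITE_FILE.read_bytes(), "fs.file.impl")
        print(f"fs.file.impl = {value}")
    else:
        print(f"{SITE_FILE} not found; point SITE_FILE at your nutch-site.xml.")
```

This only checks that the entry is present and well-formed; the WinLocalFileSystem class itself still has to be on the Nutch classpath for the setting to take effect.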