I am trying to run Apache Nutch 1.15 (local mode) on Windows 10. I followed the steps described at https://wiki.apache.org/nutch/NutchTutorial and https://wiki.apache.org/nutch/NutchHadoopSingleNodeTutorial. When I try to inject the URLs by running this command in Cygwin: bin/nutch inject crawl/crawldb urls, I get this error:

Injector: java.io.IOException: (null) entry in command string: null chmod 0644 C:\Users\INFO\Desktop\apache-nutch1.15\runtime\local\crawl\crawldb\.locked

When I add %HADOOP_HOME% to the system path (the solution proposed in "Apache Nutch error: Injector: java.io.IOException: (null) entry in command string: null chmod 0644"), I get a new error:

Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z

By the way, for HADOOP_HOME I have tried both the hadoop-2.8.0 folder and the winutil folder, but the problem is the same in both cases.

Please help.

1 Answer

This is a known issue in Nutch. The JIRA issue and the fix are linked below. If you apply the changes from the Git commit to your local bin/nutch file, everything works again. The fix will be included when Nutch 1.16 is released.

JIRA: https://issues.apache.org/jira/browse/NUTCH-2639?page=com.atlassian.jira.plugin.system.issuetabpanels%3Aall-tabpanel

FIX: https://github.com/apache/nutch/pull/378/commits/7e4502089ecebd194c75719485b6fce1a65797e9
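Independently of the bin/nutch fix, it may be worth confirming that the folder used as HADOOP_HOME actually contains the Windows helper files. As far as I know, Hadoop on Windows runs chmod through bin\winutils.exe, and the NativeIO$Windows calls are implemented in bin\hadoop.dll, so a missing or mismatched file is a common cause of both errors in the question. The function below is only a rough sketch: check_hadoop_home is a hypothetical helper name, not something shipped with Nutch or Hadoop, and it checks only that the files exist, not that their version matches your Hadoop libraries.

```shell
# Hypothetical helper (not part of Nutch or Hadoop): report whether the
# Windows helper files are present under a candidate HADOOP_HOME folder.
# The body runs in a subshell, so its variables do not leak to the caller.
check_hadoop_home() (
  home="$1"
  missing=0

  if [ -z "$home" ]; then
    echo "HADOOP_HOME is empty or unset"
    return 1
  fi

  # Assumption: winutils.exe is what Hadoop invokes for chmod on Windows,
  # and hadoop.dll provides the native NativeIO methods.
  for f in winutils.exe hadoop.dll; do
    if [ -f "$home/bin/$f" ]; then
      echo "found:   $home/bin/$f"
    else
      echo "missing: $home/bin/$f"
      missing=1
    fi
  done

  # Exit status 0 only when both files were found.
  return "$missing"
)

# Usage in Cygwin, converting the Windows-style value to a POSIX path first:
#   check_hadoop_home "$(cygpath -u "$HADOOP_HOME")"
```

If either file is reported as missing, the folder is probably not a usable HADOOP_HOME on Windows. If both are found and the error persists, the bin/nutch change linked above is the more likely remedy.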