2
votes

I have set JAVA_HOME in my Unix environment. The problem looks like a classpath issue, but I'm not sure.

When I execute this command line:

ahmed@ubuntu:~/apache-nutch-1.9/bin$ ./nutch bin/Crawl

I get this exception:

Exception in thread "main" java.lang.NoClassDefFoundError: bin/Crawl
Caused by: java.lang.ClassNotFoundException: bin.Crawl
    at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
Could not find the main class: bin/Crawl. Program will exit.

Can anyone tell me what is wrong?

1
The jar file is not on the classpath; that is why you are getting this error. - Danyal Sandeelo

1 Answer

1
votes

There is no command called 'bin/Crawl'. If you run ./bin/nutch without arguments, it prints the list of available commands:

Usage: nutch COMMAND
 where COMMAND is one of:
 inject         inject new urls into the database
 hostinject     creates or updates an existing host table from a text file
 generate       generate new batches to fetch from crawl db
 fetch          fetch URLs marked during generate
 parse          parse URLs marked during fetch
 updatedb       update web table after parsing
 updatehostdb   update host table after parsing
 readdb         read/dump records from page database
 readhostdb     display entries from the hostDB
 index          run the plugin-based indexer on parsed batches
 elasticindex   run the elasticsearch indexer - DEPRECATED use the index command instead
 solrindex      run the solr indexer on parsed batches - DEPRECATED use the index command instead
 solrdedup      remove duplicates from solr
 solrclean      remove HTTP 301 and 404 documents from solr - DEPRECATED use the clean command instead
 clean          remove HTTP 301 and 404 documents and duplicates from indexing backends configured via plugins
 parsechecker   check the parser for a given url
 indexchecker   check the indexing filters for a given url
 plugin         load a plugin and run one of its classes main()
 nutchserver    run a (local) Nutch server on a user defined port
 webapp         run a local Nutch web application
 junit          runs the given JUnit test
 or
 CLASSNAME  run the class named CLASSNAME
Most commands print help when invoked w/o parameters.

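Since no 'bin/Crawl' command exists, the script assumes the argument is a CLASSNAME and passes it to Java. Java then looks for a class named bin.Crawl, cannot find it, and throws the error you see.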
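To illustrate the fallback, here is a minimal sketch of how a dispatcher like this typically behaves. It is not the actual bin/nutch script: the function name resolve_class and the com.example class names are hypothetical placeholders, used only to show why an unknown argument ends up being treated as a class name.

```shell
#!/bin/sh
# Simplified sketch of a command dispatcher; NOT the real bin/nutch script.
# resolve_class and the com.example.* names are hypothetical placeholders.
resolve_class() {
  case "$1" in
    inject)   echo "com.example.InjectTool" ;;    # known command -> its class
    generate) echo "com.example.GenerateTool" ;;  # known command -> its class
    *)        echo "$1" ;;                        # unknown: assume it is a CLASSNAME
  esac
}

resolve_class inject      # known command, mapped to a class
resolve_class bin/Crawl   # unknown, passed through unchanged
```

In this sketch, 'bin/Crawl' matches no known command, so it is handed on as-is. The Java launcher then reads it as the class bin.Crawl, which matches the ClassNotFoundException in the question.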

There used to be a ./bin/nutch crawl command (now deprecated), but crawling is now handled by a dedicated script. Run it from the Nutch installation directory (the parent of bin/), not from inside bin/:

./bin/crawl