0
votes

I am trying to run Mahout using .\bin\hadoop jar path_to_mahout_jar, followed by the usual arguments.

It only works when the input is a local file. When I try using a file from the Hadoop file system it gives this error:

Exception in thread "main" java.io.FileNotFoundException: input (The system cannot find the file specified)
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(FileInputStream.java:120)
        at org.apache.mahout.classifier.sgd.TrainLogistic.open(TrainLogistic.java:316)
        at org.apache.mahout.classifier.sgd.TrainLogistic.mainToOutput(TrainLogistic.java:75)
        at org.apache.mahout.classifier.sgd.TrainLogistic.main(TrainLogistic.java:64)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

However, I can see the file when I look in HDFS.

3
How are you specifying the input? It isn't shown. Try an hdfs:// URI. - Sean Owen
I specified hdfs://54.186.225.72/data, but it is still not working. The error now shows the path as hdfs:/54.186.225.72/data, with "hdfs:/" instead of "hdfs://". - Meet Mehta
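
A possible explanation for the "hdfs:/" form: the stack trace shows TrainLogistic.open going through java.io.FileInputStream, which only understands local path names. The sketch below is a standalone illustration, not Mahout code; the class name HdfsPathDemo and the helper asLocalPath are made up for this example. It shows how java.io.File treats an hdfs:// URI as an ordinary local path and normalizes the separators. On a Unix-like JVM the doubled slash is typically collapsed to a single one, which would match the path in the error above.

```java
import java.io.File;

public class HdfsPathDemo {
    // Hypothetical helper: returns the path string java.io.File keeps
    // after normalizing whatever string it was given.
    static String asLocalPath(String input) {
        return new File(input).getPath();
    }

    public static void main(String[] args) {
        String uri = "hdfs://54.186.225.72/data";

        // java.io.File does not know about URI schemes; it sees a relative
        // local path whose first component happens to be "hdfs:".
        System.out.println("Path as seen by java.io.File: " + asLocalPath(uri));

        // No such local file exists, so opening it with FileInputStream
        // would raise FileNotFoundException.
        System.out.println("Exists locally: " + new File(uri).exists());
    }
}
```

If this is what is happening, changing the URI alone cannot help, because the input is never opened through the Hadoop file system API.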

3 Answers

0
votes

It's strange. For me, Mahout was looking for files in directories in HDFS, so to make Mahout read from my local file system I had to give it a file:/// URI. Maybe you should try an hdfs:// URI, as Sean suggested for your problem.

0
votes

The TrainLogistic algorithm (and some other classification algorithms as well) cannot be run on HDFS.

Check this link, which says it can only be run on a single machine.

Good luck..!
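
One workaround that follows from this, offered as a rough sketch rather than a tested recipe: copy the file out of HDFS first and pass the local copy as the input. The example below assumes the hadoop launcher is on the PATH and uses the standard hadoop fs -copyToLocal command. The class name CopyFromHdfs, both helper methods, and the local file name data.csv are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class CopyFromHdfs {
    // Hypothetical helper: builds the "hadoop fs -copyToLocal <src> <dst>" command line.
    static List<String> copyToLocalCommand(String hdfsUri, String localPath) {
        return Arrays.asList("hadoop", "fs", "-copyToLocal", hdfsUri, localPath);
    }

    // Runs the command, forwarding its output to this console,
    // and returns the exit code of the process.
    static int run(List<String> command) throws Exception {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: "data.csv" is just an example name for the local copy.
        List<String> command = copyToLocalCommand("hdfs://54.186.225.72/data", "data.csv");
        int exitCode = run(command);
        if (exitCode != 0) {
            throw new IllegalStateException("copyToLocal failed with exit code " + exitCode);
        }
        // The local copy can now be given to the trainer as its input file.
    }
}
```

The same copy can of course be done by hand from the shell; the point is only that the trainer then sees a plain local file.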