How to specify path of a file in java/terminal on Hadoop?

Question

I am running a task on Hadoop2:

$hadoop jar hipi.jar "/5" "/processWOH" 1

hipi.jar: the jar file name

"/5": the input folder name

"/processWOH": the output folder name

I am getting an exception regarding the path /localhost:9000/5/LC814000.tif:

Error: java.io.FileNotFoundException: /localhost:9000/5/LC814000.tif (No such file or directory)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(FileInputStream.java:195)
        at java.io.FileInputStream.<init>(FileInputStream.java:138)
        at java.io.FileInputStream.<init>(FileInputStream.java:93)
        at ProcessWithoutHIPI.ProcessRecordReaderWOH.getCurrentKey(ProcessRecordReaderWOH.java:81)
        at ProcessWithoutHIPI.ProcessRecordReaderWOH.getCurrentKey(ProcessRecordReaderWOH.java:1)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.getCurrentKey(MapTask.java:507)
        at org.apache.hadoop.mapreduce.task.MapContextImpl.getCurrentKey(MapContextImpl.java:70)
        at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.getCurrentKey(WrappedMapper.java:81)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

I think (but I am not sure) that the problem is the extra "/localhost:9000" added to the path, but I don't know what adds it (Hadoop, the Java code, ...).

Note: this jar file runs fine outside of Hadoop, but not inside Hadoop (HDFS).

Any help is appreciated

Update: I discovered later that the "/5" folder is searched for in the local file system, not in HDFS. If I create a folder named "localhost:9000" under the local root (i.e. /localhost:9000) and put "/5" inside it, the code runs, but then the data is read from outside Hadoop, as if I were not using Hadoop at all. So is this a programming mistake, i.e. should I use the Hadoop IO packages instead of the java.io packages so that the code works with HDFS rather than the local file system, or is it a different problem?

The prefix /localhost:9000 relates to the HDFS path. Please run the following command and paste the result: $hadoop fs -ls /localhost:9000/ — Imi.Cino
@Imi.Cino Thanks. I am out of the office now; tomorrow morning I'll run it and post the result. — Mosab Shaheen
/localhost:9000 is not a folder; 9000 is the port of your HDFS! You can see it in your core-site.xml. Please show your core-site.xml and your MapReduce program. — Imi.Cino
@Imi.Cino That's true in the correct case, but in my case all the paths are taken from the local file system, not HDFS, and I don't know why. Because of that, I created a folder /localhost:9000/ in the local file system and it worked, but now all the data is read and written outside Hadoop! — Mosab Shaheen
@Imi.Cino <configuration> <property> <name>fs.default.name</name> <value>hdfs://localhost:9000</value> </property> </configuration> — Mosab Shaheen
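
To illustrate the point made in the comments above, here is a minimal sketch using only the standard java.net.URI class (no Hadoop classes). It assumes the full location of the input file is hdfs://localhost:9000/5/LC814000.tif, which is a combination of the fs.default.name value and the file name from the stack trace; that combined string and the class name HdfsUriParts are illustrative assumptions, not something taken from the asker's code. It shows that "localhost:9000" is the authority (host and port) of the URI rather than a directory, and that a bare path such as "/5/LC814000.tif" carries no scheme at all.

```java
import java.net.URI;

public class HdfsUriParts {
    // The scheme names the kind of filesystem ("hdfs", "file", ...); null if absent.
    static String scheme(String location) {
        return URI.create(location).getScheme();
    }

    // The authority is the host:port part (here, the NameNode address); null if absent.
    static String authority(String location) {
        return URI.create(location).getAuthority();
    }

    // The path is the location of the file inside that filesystem.
    static String path(String location) {
        return URI.create(location).getPath();
    }

    public static void main(String[] args) {
        // Assumed full location: fs.default.name value + file name from the stack trace.
        String full = "hdfs://localhost:9000/5/LC814000.tif";
        System.out.println("scheme    = " + scheme(full));     // hdfs
        System.out.println("authority = " + authority(full));  // localhost:9000
        System.out.println("path      = " + path(full));       // /5/LC814000.tif

        // A bare path has no scheme or authority, so nothing in the string says "HDFS".
        String bare = "/5/LC814000.tif";
        System.out.println("scheme    = " + scheme(bare));     // null
        System.out.println("authority = " + authority(bare));  // null
    }
}
```

java.io.FileInputStream does not interpret any of these parts: it treats whatever string it is given as a path on the local disk, which matches the behaviour described in the question's update. Hadoop's own org.apache.hadoop.fs.FileSystem and org.apache.hadoop.fs.Path classes are the ones that understand the hdfs scheme and the host:port authority.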

Imi.Cino · Accepted Answer · 2017-03-24T20:53:56

The default directory of your HDFS is /localhost:9000/, and Hadoop cannot find your input file there. Just put it in /localhost:9000/:

$hadoop fs -put $LOCAL_PATH_OF_INPUT_FILE:/5 /localhost:9000/
$hadoop jar hipi.jar "/5" "/processWOH" 1

Good luck!
