I asked a similar question a while back, but at the time I had no idea what I was talking about. I am posting this question again with further details and more to-the-point queries.
I have set up a Hadoop cluster with a namenode and 2 datanodes, using Hadoop 2.9.0. I ran the command hdfs dfs -put "SomeRandomFile" and it seems to work fine. The only confusion I have is: why does it store my file under the /user/hduser/ path? I didn't specify this path anywhere in the configuration, so how does it build this path on HDFS?
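(For context, and as a likely explanation: HDFS resolves relative destination paths against the user's HDFS home directory, which defaults to /user/&lt;username&gt; — hence /user/hduser here, derived from the username rather than from any configuration file. A sketch of the behavior, with illustrative paths:)

```shell
# A relative destination resolves against the HDFS home directory, /user/<username>:
hdfs dfs -put SomeRandomFile            # stored as /user/hduser/SomeRandomFile

# An absolute destination is used as-is:
hdfs dfs -put SomeRandomFile /data/in/  # stored as /data/in/SomeRandomFile
```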
Furthermore, I created a small Java program to do the same thing. I created a simple Eclipse project and wrote the following lines:
import java.io.InputStream;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public static boolean fileWriteHDFS(InputStream input, String fileName) {
    try {
        System.setProperty("HADOOP_USER_NAME", "hduser");
        // Get the configuration of the Hadoop system
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://localhost:9000");
        // Extract the destination path
        URI uri = URI.create(DESTINATION_PATH + fileName);
        Path path = new Path(uri);
        // Destination file system in HDFS
        FileSystem fs = FileSystem.get(uri, conf);
        // Check whether the file already exists
        if (fs.exists(path)) {
            // Write an appropriate error to the log file and return.
            return false;
        }
        // Create an output stream to the destination path
        FSDataOutputStream out = fs.create(path);
        // Copy the file from the input stream to HDFS;
        // the final 'true' argument closes both streams when the copy finishes
        IOUtils.copyBytes(input, out, 4096, true);
        fs.close();
        // All went as planned
        return true;
    } catch (Exception e) {
        // Something went wrong
        System.out.println(e.toString());
        return false;
    }
}
And I added the following three Hadoop libraries:
/home/hduser/bin/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar
/home/hduser/bin/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0-tests.jar
/home/hduser/bin/hadoop-2.9.0/share/hadoop/common/hadoop-nfs-2.9.0.jar
As you can see, my Hadoop installation location is /home/hduser/bin/hadoop-2.9.0/... When I run this code, it throws an exception:
Exception in thread "main" java.lang.NoClassDefFoundError: com/ctc/wstx/io/InputBootstrapper
at com.ws.filewrite.fileWrite.fileWriteHDFS(fileWrite.java:21)
at com.ws.main.listenerService.main(listenerService.java:21)
Caused by: java.lang.ClassNotFoundException: com.ctc.wstx.io.InputBootstrapper
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 2 more
Specifically, the exception is thrown at this line:
Configuration conf = new Configuration();
Am I missing something here? What is causing this problem? I am completely new to HDFS, so pardon me if this is an obvious problem.
Comments:

Regarding the -tests JAR: secondly, there are other libraries those JARs depend on. Suggestion: use Maven, not manually adding JAR files to your classpath. – OneCricketeer
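(A note on the error itself: com.ctc.wstx.io.InputBootstrapper belongs to the Woodstox StAX parser, which Hadoop 2.9.x's Configuration class uses, so the NoClassDefFoundError means a transitive dependency of hadoop-common is missing from the classpath. Following the Maven suggestion, a minimal sketch of the dependency that pulls in hadoop-common and its transitive dependencies, assuming the version should match the cluster:)

```xml
<!-- pom.xml fragment: the hadoop-client artifact brings in hadoop-common
     together with its transitive dependencies (including Woodstox) -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.9.0</version>
</dependency>
```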