I am using the Hadoop file system (HDFS) 3.3.0 with the Hadoop Java client API on Windows 10. Below are my Hadoop configuration files.

== core-site.xml

<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

== hdfs-site.xml

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.http.address</name>
        <value>localhost:50070</value>
    </property>
    <property>
        <name>dfs.name.dir</name>
        <value>file:///C:/hadoop-3.3.0/data/hdfs/namenode</value>
    </property>
    <property>
        <name>dfs.data.dir</name>
        <value>file:///C:/hadoop-3.3.0/data/hdfs/datanode</value>
    </property>
</configuration>

I am trying to save some files to the Hadoop file system with the Java API.

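import java.io.BufferedWriter;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Load the cluster settings from the local Hadoop installation.
Configuration conf = new Configuration();
conf.addResource(new Path("file:///C:/hadoop-3.3.0/etc/hadoop/core-site.xml"));
conf.addResource(new Path("file:///C:/hadoop-3.3.0/etc/hadoop/hdfs-site.xml"));
FileSystem hadoopFs = FileSystem.get(conf);

Path hadoopPath = new Path(filename);

// Append if the file already exists; otherwise create it (overwrite = true).
FSDataOutputStream hadoopOutStream = hadoopFs.exists(hadoopPath)
        ? hadoopFs.append(hadoopPath)
        : hadoopFs.create(hadoopPath, true);

// Closing the writer also flushes and closes the underlying HDFS stream,
// so no separate hadoopOutStream.close() is needed.
try (BufferedWriter bw = new BufferedWriter(
        new OutputStreamWriter(hadoopOutStream, StandardCharsets.UTF_8))) {
    bw.write("data....");
}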

The write succeeds and throws no exception. The problem is that the files are not saved under the Hadoop home folder that I specified in the configuration files.

When I run the hdfs command-line interface, the output looks like this:

>hdfs dfs -ls /user/joseph
Found 1 items
-rw-r--r--   1 joseph supergroup     172120 2020-12-11 17:34 /user/joseph/saved_data.csv

The file is saved not under hadoop_home but under a folder named after my Windows user. How can I change the default save folder? Any reply would be appreciated. Best regards

HADOOP_HOME is a local filesystem path. The default HDFS write path is /user/$(whoami), and I'm not sure there is a way to override that without specifying an absolute path. - OneCricketeer

1 Answer

HADOOP_HOME is the path of your Hadoop installation; it has nothing to do with HDFS paths. By default, HDFS stores a user's files under that user's home directory, /user/nameOfUser. The /user part is configurable through the dfs.user.home.dir.prefix property, whose default value is /user. You can change it in your HDFS client configuration to use a prefix other than /user.
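
To make the resolution rule concrete, here is a minimal plain-Java sketch of the behaviour described above. It does not call the Hadoop API, and it is only a model, not Hadoop's actual implementation. The class HomeDirSketch, its methods, and the /data prefix are hypothetical names chosen for illustration; only the property name dfs.user.home.dir.prefix and its default /user come from the answer.

```java
import java.util.Properties;

// Illustrative model only: mimics how a relative HDFS path ends up under
// <prefix>/<user>. It is NOT Hadoop code; all names here are hypothetical.
class HomeDirSketch {
    static final String PREFIX_KEY = "dfs.user.home.dir.prefix";
    static final String DEFAULT_PREFIX = "/user";

    // Home directory = configured prefix (default /user) + "/" + user name.
    static String homeDirectory(Properties conf, String user) {
        String prefix = conf.getProperty(PREFIX_KEY, DEFAULT_PREFIX);
        if (prefix.endsWith("/")) {
            // Avoid a double slash when the prefix is "/" or ends with "/".
            prefix = prefix.substring(0, prefix.length() - 1);
        }
        return prefix + "/" + user;
    }

    // Absolute paths are used as given; relative ones go under the home directory.
    static String resolve(Properties conf, String user, String path) {
        if (path.startsWith("/")) {
            return path;
        }
        return homeDirectory(conf, user) + "/" + path;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();

        // Default prefix: prints /user/joseph/saved_data.csv
        System.out.println(resolve(conf, "joseph", "saved_data.csv"));

        // Hypothetical custom prefix: prints /data/joseph/saved_data.csv
        conf.setProperty(PREFIX_KEY, "/data");
        System.out.println(resolve(conf, "joseph", "saved_data.csv"));

        // An absolute path ignores the home directory entirely:
        // prints /hadoop/out/saved_data.csv
        System.out.println(resolve(conf, "joseph", "/hadoop/out/saved_data.csv"));
    }
}
```

In the question's code, the equivalent of the last case would be passing an absolute path string to new Path(...), which is the approach suggested in the comment; changing the prefix only moves the home directory, it does not remove the user-name component.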