The core-site.xml file tells the Hadoop daemons where the NameNode runs in the cluster. It contains the configuration settings for Hadoop Core, such as I/O settings that are common to HDFS and MapReduce.
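For concreteness, a minimal core-site.xml along these lines might look like the sketch below. The host name and port are placeholders I made up for illustration, and fs.defaultFS is the Hadoop 2+ property name (older releases used fs.default.name).

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Default filesystem URI: this is how daemons and clients find the NameNode. -->
  <!-- namenode.example.com and 8020 are hypothetical values, not taken from a real cluster. -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>
```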
The hdfs-site.xml file contains the configuration settings for the HDFS daemons: the NameNode, the Secondary NameNode, and the DataNodes. Here, we can configure hdfs-site.xml to specify the default block replication and permission checking on HDFS. The actual number of replicas can also be specified when a file is created; the default is used if replication is not specified at create time.
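As a rough illustration of the two settings mentioned above, an hdfs-site.xml might contain something like the following. The values shown are just the usual defaults, used here as an assumption, and dfs.permissions.enabled is the Hadoop 2+ name of the permission-checking switch.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Default number of replicas per block, used when a file is created without an explicit replication. -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <!-- Enables or disables permission checking on HDFS. -->
  <property>
    <name>dfs.permissions.enabled</name>
    <value>true</value>
  </property>
</configuration>
```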
I'm trying to understand which processes (Namenode, Datanode, HDFS client) need access to which of these configuration files.
- Namenode: I presume it only needs hdfs-site.xml because it doesn't need to know its own location.
- Datanode: I presume it needs access to both core-site.xml (to locate the namenode) and hdfs-site.xml (for various settings)?
- HDFS client: I presume it needs access to both core-site.xml (to locate the namenode) and hdfs-site.xml (for various settings)?
Is that accurate?