I have just set up a Hadoop 2.7.3 cluster. I load data ranging in size from 1 GB up to 20 GB into HDFS and can work with it normally, but after I restart the cluster the data is no longer accessible. The NameNode web UI shows: `WARNING : There are about xx missing blocks. Please check the log or run fsck`, which means that some blocks in the HDFS installation do not have a single replica on any of the live DataNodes. Here is my hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/hduser/hadoop-2.7.3/namenode</value>
    <description>NameNode directory for namespace and transaction logs storage.</description>
  </property>
  <property>
    <name>dfs.safemode.threshold.pct</name>
    <value>0</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.datanode.use.datanode.hostname</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>node1:50070</value>
    <description>Your NameNode hostname for http access.</description>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>node1:50090</value>
    <description>Your Secondary NameNode hostname for http access.</description>
  </property>
</configuration>
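One thing I notice is that the config above sets `dfs.namenode.name.dir` but never sets `dfs.datanode.data.dir`, so the DataNodes fall back to the default location under `hadoop.tmp.dir` (typically somewhere in `/tmp`), which many systems clear on reboot. If that is the cause, an explicit setting would look something like this (the path `/home/hduser/hadoop-2.7.3/datanode` is only an assumed example, chosen to mirror the NameNode directory above, not a value from the original config):

```xml
<property>
  <name>dfs.datanode.data.dir</name>
  <!-- Assumed example path: any durable directory outside /tmp will do. -->
  <value>file:///home/hduser/hadoop-2.7.3/datanode</value>
  <description>DataNode directory for storing HDFS block data.</description>
</property>
```

With the blocks stored outside `/tmp`, they should survive a DataNode restart instead of being reported as missing.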