0
votes

As I understand it, the NameNode in Hadoop does not persist block information. It is kept in memory, and on startup the DataNodes report the blocks they hold.

If I use copyFromLocal to copy a file to HDFS, the file is transferred, because I can see it with "hadoop fs -ls".

I was wondering how Hadoop knows which filename corresponds to which blocks.

1 Answer

1
votes

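The NameNode maintains a file system image (fsimage), which stores the mapping from files to blocks. It also keeps an edit log, which records every change made to the file system since that image was written. The Secondary NameNode periodically fetches the fsimage and the edit log from the NameNode and merges them to create a new fsimage for the NameNode. What is not persisted is the location of each block, i.e. which DataNodes hold it; that part is kept in memory and rebuilt from the block reports the DataNodes send on startup.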
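To make the split between persisted and in-memory state concrete, here is a minimal toy model in Python. It is only a sketch, not Hadoop code: the class `ToyNameNode` and all of its methods are hypothetical names made up for illustration, the block IDs are plain integers, and the real fsimage and edit log are binary on-disk formats with many more operation types.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNameNode:
    """Hypothetical toy model of NameNode metadata (not the Hadoop API)."""

    # "Persisted" state: last checkpointed image (path -> block IDs)
    # plus the edit log of changes made since that image was written.
    fsimage: dict = field(default_factory=dict)
    edit_log: list = field(default_factory=list)
    # In-memory only: block ID -> set of DataNodes holding a replica.
    block_locations: dict = field(default_factory=dict)

    def add_file(self, path, block_ids):
        # A write is recorded in the edit log, not in the image itself.
        self.edit_log.append(("ADD", path, list(block_ids)))

    def delete_file(self, path):
        self.edit_log.append(("DELETE", path, []))

    def namespace(self):
        """Replay the edit log on top of the image to get the current view."""
        view = {path: list(blocks) for path, blocks in self.fsimage.items()}
        for op, path, blocks in self.edit_log:
            if op == "ADD":
                view[path] = list(blocks)
            elif op == "DELETE":
                view.pop(path, None)
        return view

    def checkpoint(self):
        """Merge image and edit log into a new image, then empty the log."""
        self.fsimage = self.namespace()
        self.edit_log = []

    def restart(self):
        """Simulate a restart: image and log survive, locations do not."""
        self.block_locations = {}

    def block_report(self, datanode, block_ids):
        """A DataNode announces which blocks it currently stores."""
        for block_id in block_ids:
            self.block_locations.setdefault(block_id, set()).add(datanode)

    def locate(self, path):
        """Return (block ID, sorted DataNode names) for each block of a file."""
        return [
            (block_id, sorted(self.block_locations.get(block_id, set())))
            for block_id in self.namespace()[path]
        ]
```

In this model, calling `restart()` leaves `namespace()` unchanged, so the file name still maps to its block IDs, but `locate()` returns empty DataNode lists until `block_report()` is called again. On a real cluster, `hdfs fsck <path> -files -blocks -locations` shows the blocks and DataNodes for an actual file.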