Why can't the HDFS client send data directly to a DataNode?
What's the advantage of the HDFS client cache?
- An application request to create a file does not reach the NameNode immediately.
- In fact, initially the HDFS client caches the file data into a temporary local file.
- Application writes are transparently redirected to this temporary local file.
- When the local file accumulates at least one HDFS block's worth of data, the client contacts the NameNode to create the file.
- The NameNode then proceeds as described in the section on Create. The client flushes the block of data from the local temporary file to the DataNodes specified by the NameNode.
- When the file is closed, any remaining unflushed data in the temporary local file is transferred to the DataNode.
- The client then tells the NameNode that the file is closed.
- At this point, the NameNode commits the file creation operation to a persistent store. If the NameNode dies before the file is closed, the file is lost.
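
The staging steps above can be sketched as a small in-memory simulation. This is only an illustration of the buffering logic, not the HDFS client API: `FakeNameNode`, `StagingWriter`, the `datanode_blocks` list, and the 4-byte block size are all hypothetical names and values chosen for the example.

```python
import tempfile


class FakeNameNode:
    """Hypothetical stand-in for the NameNode; it tracks metadata only."""

    def __init__(self):
        self.pending = {}    # path -> blocks allocated, file not yet closed
        self.committed = {}  # path -> blocks, recorded when the file is closed

    def allocate_block(self, path):
        # The first call for a path plays the role of the deferred "create".
        self.pending[path] = self.pending.get(path, 0) + 1

    def close_file(self, path):
        # Only now is the file considered durable in this sketch.
        self.committed[path] = self.pending.pop(path, 0)


class StagingWriter:
    """Hypothetical client that stages writes in a local temporary file."""

    def __init__(self, namenode, datanode_blocks, path, block_size):
        self.namenode = namenode
        self.datanode_blocks = datanode_blocks  # list standing in for DataNodes
        self.path = path
        self.block_size = block_size
        self.staging = tempfile.TemporaryFile()
        self.staged = 0  # bytes currently held in the staging file

    def write(self, data):
        # Application writes go to the local file, not to the cluster.
        self.staging.write(data)
        self.staged += len(data)
        while self.staged >= self.block_size:
            self._flush(self.block_size)

    def _flush(self, nbytes):
        self.staging.seek(0)
        block = self.staging.read(nbytes)
        rest = self.staging.read()
        self.namenode.allocate_block(self.path)  # contact the NameNode
        self.datanode_blocks.append(block)       # ship the block to a DataNode
        # Keep only the bytes that have not been flushed yet.
        self.staging.seek(0)
        self.staging.truncate()
        self.staging.write(rest)
        self.staged = len(rest)

    def close(self):
        if self.staged > 0:
            self._flush(self.staged)  # transfer the remaining partial block
        self.staging.close()
        self.namenode.close_file(self.path)


namenode = FakeNameNode()
datanode_blocks = []
writer = StagingWriter(namenode, datanode_blocks, "/demo.txt", block_size=4)

writer.write(b"abc")
contacted_early = bool(namenode.pending)  # False: less than one block staged

writer.write(b"defgh")  # 8 bytes staged, so two full blocks are flushed
writer.write(b"ij")
writer.close()          # flushes the last 2 bytes, then commits the file
```

In this sketch the NameNode is contacted once per block rather than once per application write, which is the benefit the staging description points at: small writes are absorbed locally and never turn into individual network round trips.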