When I run the hdfs dfs -ls command, I would like to know whether the result covers all the files stored in the cluster or just the partitions on the node where it is executed. I'm a newbie in Hadoop and I'm having some problems finding the partitions on each node.

Thank you

1 Answer

Question: "...if the result are all the files stored in the cluster or..."

What the ls command shows are files stored in the cluster, not just files on the node where you run it. More specifically, you see the file paths and names under the directory you list. This information is part of the namespace, which is managed by the NameNode.

"...just the partitions in the node where it is executed.."

If you think HDFS keeps some files on this node and other files on another node, that is a misunderstanding. There is no such thing. The NameNode keeps track of the namespace and the blocksMap. Files are composed of blocks. The NameNode knows how many blocks each file has and which DataNodes hold those blocks. The NameNode also decides where the blocks are placed, and this is transparent to the user.

Each block has 3 replicas by default, and each replica is stored on a different DataNode. So if a file has 2 blocks, its data can be spread over at most 6 DataNodes. When it is spread over 6, no DataNode holds the complete file. By contrast, in the other common case of a small file with only 1 block, each replica is a complete copy of the file.
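To make the block arithmetic concrete, here is a toy model in Python. It is not HDFS code: the 128 MiB block size, the round-robin placement, the node names dn1..dn6 and both function names are assumptions made up for illustration (real HDFS placement is rack-aware and decided by the NameNode).

```python
import math
from itertools import cycle

BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size (128 MiB)
REPLICATION = 3                 # default replication factor mentioned above


def place_blocks(file_size, datanodes, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Return {block_index: [datanode, ...]} using naive round-robin placement.

    Hypothetical helper: it only mimics the idea of a block-to-DataNode map.
    """
    if replication > len(datanodes):
        raise ValueError("need at least as many DataNodes as replicas")
    num_blocks = math.ceil(file_size / block_size)
    ring = cycle(datanodes)
    # Consecutive picks from the ring are distinct, so each block's
    # replicas land on different DataNodes.
    return {b: [next(ring) for _ in range(replication)] for b in range(num_blocks)}


def nodes_with_whole_file(block_map):
    """Return the DataNodes that hold a replica of every block of the file."""
    replica_sets = [set(nodes) for nodes in block_map.values()]
    return set.intersection(*replica_sets) if replica_sets else set()


datanodes = [f"dn{i}" for i in range(1, 7)]

two_blocks = place_blocks(200 * 1024 * 1024, datanodes)  # 200 MiB -> 2 blocks
one_block = place_blocks(10 * 1024 * 1024, datanodes)    # 10 MiB -> 1 block

print(two_blocks)                         # 2 blocks x 3 replicas on 6 nodes
print(nodes_with_whole_file(two_blocks))  # empty: no node has both blocks
print(nodes_with_whole_file(one_block))   # 3 nodes, each with the whole file
```

On a real cluster, hdfs fsck <path> -files -blocks -locations reports which DataNodes hold each block of a file, which is the information to look at when searching for data on individual nodes.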

For more information, take a look at the official HDFS Design documentation.