According to the Apache documentation on HDFS Federation, the name service scales horizontally by federating multiple Namenodes that run in isolation from one another.
Multiple Namenodes/Namespaces
In order to scale the name service horizontally, federation uses multiple independent Namenodes/namespaces. The Namenodes are federated; the Namenodes are independent and do not require coordination with each other. The Datanodes are used as common storage for blocks by all the Namenodes.
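To check that I am reading the quoted paragraph correctly, here is a toy sketch of how I picture it. This is not Hadoop code: the classes `NameNode` and `DataNode`, the round-robin placement, and the idea of keying stored blocks by the owning Namenode's name are all simplifying assumptions of mine. It only illustrates that each Namenode keeps its own namespace without talking to the others, while the same Datanodes hold blocks for all of them.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Toy Datanode (hypothetical): stores blocks for every Namenode."""
    name: str
    # Keyed by (owning Namenode name, block id) so owners never collide.
    blocks: dict = field(default_factory=dict)

    def store(self, owner: str, block_id: int, data: bytes) -> None:
        self.blocks[(owner, block_id)] = data


@dataclass
class NameNode:
    """Toy Namenode (hypothetical): owns one namespace, knows no other Namenode."""
    name: str
    datanodes: list
    namespace: dict = field(default_factory=dict)  # path -> (datanode name, block id)
    next_block_id: int = 0

    def create(self, path: str, data: bytes) -> None:
        block_id = self.next_block_id
        self.next_block_id += 1
        # Round-robin placement is an assumption; real placement is more involved.
        target = self.datanodes[block_id % len(self.datanodes)]
        target.store(self.name, block_id, data)
        self.namespace[path] = (target.name, block_id)


# Two independent Namenodes sharing the same pool of Datanodes.
shared = [DataNode("dn1"), DataNode("dn2")]
nn_a = NameNode("nn-a", shared)
nn_b = NameNode("nn-b", shared)

nn_a.create("/logs/app.log", b"a")
nn_b.create("/users/alice.txt", b"b")
```

In this sketch `nn_a` has no idea that `/users/alice.txt` exists, and nothing in the model tells a caller which of the two Namenodes to contact for a given path.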
My only question:
I do not see any central coordinator among the Namenodes, since they all run in isolation, so I am confused about how jobs get submitted and processed.
1) If I submit a MapReduce job, which Namenode will process it? Or
2) Does the client need to know which Namenode the job has to be submitted to?
If the client does not know which Namenode to use, there would have to be some "master Namenode" that assigns the job to a particular Namenode.
How does it work?
Thanks in advance.