Say I have an HDFS cluster (v 2.0.5) that spans multiple racks but was not originally set up with rack awareness. Data has been loaded into it with the default 3x replication. If I now configure HDFS to be rack aware, all three replicas of an existing block could very well sit on the same rack, which is not what I want.
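To be concrete about what I mean by "configure HDFS to be rack aware": a minimal sketch of a topology script, assuming it is wired in through the `net.topology.script.file.name` property in `core-site.xml`. The addresses and rack names are made up for illustration, and `resolve` is just a helper name I picked.

```python
#!/usr/bin/env python
"""Topology script sketch: Hadoop passes host names or IPs as arguments
and expects one rack path per argument on stdout."""
import sys

# Hypothetical host-to-rack mapping; a real cluster would list its own nodes.
RACK_BY_HOST = {
    "10.0.1.11": "/rack1",
    "10.0.1.12": "/rack1",
    "10.0.2.11": "/rack2",
    "10.0.2.12": "/rack2",
}

# Rack returned for hosts that are missing from the mapping.
DEFAULT_RACK = "/default-rack"


def resolve(hosts):
    """Return the rack path for each host, in the order given."""
    return [RACK_BY_HOST.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # One rack per line, in the same order as the arguments.
    for rack in resolve(sys.argv[1:]):
        print(rack)
```

As far as I understand, the mapping only influences where new replicas get placed, so blocks written before the change stay where they are.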
If my cluster is already balanced, would running the HDFS balancer enforce the block placement policy and shuffle replicas around appropriately, i.e. leave each block with one replica on one rack and two replicas on another? From what I have read, if the cluster is already balanced the balancer simply exits without moving anything.
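To pin down the layout I am after, here is a small sketch of the check I would want every block to pass afterwards. It assumes the only requirement is that a block's replicas span at least two racks whenever the cluster has more than one rack. That is my reading of the default policy, not something I have verified, and the function name and rack labels are hypothetical.

```python
def spans_enough_racks(replica_racks, total_racks):
    """Return True if the replicas of one block are spread acceptably.

    replica_racks: rack path of each replica of the block.
    total_racks:   number of racks in the cluster.
    """
    # With a single rack or a single replica there is nothing to spread.
    if total_racks <= 1 or len(replica_racks) <= 1:
        return True
    # Assumed rule: replicas must live on at least two distinct racks.
    return len(set(replica_racks)) >= 2


# The situation I am worried about: all three replicas on one rack.
print(spans_enough_racks(["/rack1", "/rack1", "/rack1"], total_racks=2))
# The layout I want: one replica on one rack, two on another.
print(spans_enough_racks(["/rack1", "/rack2", "/rack2"], total_racks=2))
```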
If not, how can I force HDFS to re-replicate the affected blocks onto separate racks?