I have a Hadoop cluster of 8 machines, all of which are data nodes. A program running on one of them (say machine A) continuously creates sequence files (each about 1 GB) in HDFS.
Here's the problem: all 8 machines have the same hardware and the same capacity. While the other machines still have about 50% free space on their HDFS disks, machine A has only 5% left. I checked the block info and found that almost every block has one replica on machine A.
Is there any way to balance the replicas? Thanks.