I found a similar question, "Hadoop HDFS is not distributing blocks of data evenly",
but my question is about the case where the replication factor is 1.
I still want to understand why HDFS does not distribute file blocks evenly across the cluster nodes. This results in data skew from the start when I load such files and run DataFrame operations on them. Am I missing something?
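
To make "not evenly distributed" concrete, here is a minimal sketch of how the skew could be quantified. It assumes the block locations have already been collected into (block id, DataNode host) pairs, for example by reading the output of `hdfs fsck <path> -files -blocks -locations`. The host names, block ids, and the helper names `blocks_per_node` and `skew_ratio` are hypothetical and only for illustration.

```python
from collections import Counter


def blocks_per_node(block_locations, all_nodes):
    """Count how many blocks each DataNode holds.

    block_locations: iterable of (block_id, host) pairs (one pair per block,
    since the replication factor is 1).
    all_nodes: every DataNode in the cluster, so empty nodes show up as 0.
    """
    counts = Counter({node: 0 for node in all_nodes})
    for _block_id, host in block_locations:
        counts[host] += 1
    return dict(counts)


def skew_ratio(counts):
    """Ratio of the busiest node's block count to the mean; 1.0 means even."""
    mean = sum(counts.values()) / len(counts)
    return max(counts.values()) / mean if mean else 0.0


# Hypothetical data: four blocks of one file on a three-node cluster.
nodes = ["dn1", "dn2", "dn3"]
locations = [
    ("blk_1", "dn1"),
    ("blk_2", "dn1"),
    ("blk_3", "dn1"),
    ("blk_4", "dn2"),
]

counts = blocks_per_node(locations, nodes)
print(counts)
print(skew_ratio(counts))
```

A ratio close to 1.0 would mean the blocks are spread evenly; the further above 1.0 it is, the more one node holds compared to the average.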