I am working on a four-node Hadoop cluster. I ran a series of experiments with different block sizes and measured the run time for each.
All runs used the same 20 GB input file. Results: 64 MB - 32 min, 128 MB - 19 min, 256 MB - 15 min, 1 GB - 12.5 min.
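For context, here is a rough sketch of how many blocks each setting produces for the 20 GB file, and what 2 GB blocks or a 90 GB file would give. It assumes one map task per block and binary units (1 GB = 1024 MB). Neither assumption comes from my measurements, and the helper name `num_blocks` is purely illustrative.

```python
import math

MB_PER_GB = 1024  # assumption: binary units


def num_blocks(file_gb, block_mb):
    """Number of blocks (and, by assumption, map tasks) for a file."""
    return math.ceil(file_gb * MB_PER_GB / block_mb)


# Measured run times from the experiments above: block size (MB) -> minutes
measured = {64: 32.0, 128: 19.0, 256: 15.0, 1024: 12.5}

for block_mb, minutes in measured.items():
    print(f"{block_mb:>5} MB blocks: {num_blocks(20, block_mb):>4} blocks, {minutes} min")

# Untested settings: block counts only, no run times are known for these
for file_gb, block_mb in [(20, 2048), (90, 1024), (90, 2048)]:
    print(f"{file_gb} GB file, {block_mb} MB blocks: {num_blocks(file_gb, block_mb)} blocks")
```

With only four nodes, the block count for the larger settings gets close to the number of task slots the cluster can offer, which is part of why I am unsure whether going bigger still helps.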
Should I go further and try a 2 GB block size? Also, what would be an optimal block size if similar operations were performed on a 90 GB file? Thanks!