I have a 456 KB file that is read from HDFS and given as input to the mapper function. Every line contains an integer for which I download some files and store them on the local file system. I have Hadoop set up on a two-node cluster, and the split size is changed in the program so that 8 mappers are opened:
Configuration configuration = new Configuration();
// Force small input splits (~60 KB) so the 456 KB file is divided across several mappers.
configuration.setLong("mapred.max.split.size", 60000L);
configuration.setLong("mapred.min.split.size", 60000L);
8 mappers are created, but the same data is downloaded on both servers. I think this happens because the block size is still the default 256 MB, so the input file is processed twice. So my question is: can a small file be processed with MapReduce?
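For context, here is a rough sketch of the kind of mapper I mean (the class name and the DownloadHelper call are placeholders, not my exact code):

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Each input line holds one integer id; for each id some files are downloaded
// to the local file system of the node running the task.
public class DownloadMapper extends Mapper<LongWritable, Text, Text, NullWritable> {

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        int id = Integer.parseInt(line.toString().trim());

        // Placeholder for my own download code: fetch the files for this id
        // and store them locally.
        DownloadHelper.fetchFilesFor(id);

        // Emit the id so the job output records what was processed.
        context.write(new Text(String.valueOf(id)), NullWritable.get());
    }
}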
TextInputFormat? – Mike Park