Generating multiple equally sized output files in Hadoop

Question

What are some methods for finding X data ranges in Hadoop so that one can use these ranges as partitions in the reducer step?

Tariq Tariq · Accepted Answer · 2013-06-19T19:49:47

It looks like you need something like TotalOrderPartitioner, which produces a total order across reducers by reading split points from an externally generated source, so each reducer receives one contiguous key range. You might find this post useful: http://chasebradford.wordpress.com/2010/12/12/reusable-total-order-sorting-in-hadoop/
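To make the idea of split points concrete, here is a minimal sketch in plain Java, not the Hadoop API. It assumes integer keys and a sample small enough to sort in memory; the class and method names (SplitPointSketch, splitPoints, partitionFor) are made up for illustration. It picks evenly spaced values from a sorted sample as split points, then routes each key to the range it falls into, which is roughly what a total-order partitioner does with its split points.

```java
import java.util.Arrays;

// Hypothetical illustration only: plain Java, no Hadoop classes involved.
class SplitPointSketch {

    // Choose (numPartitions - 1) split points at evenly spaced positions
    // of the sorted sample, so each range holds about the same number of keys.
    static int[] splitPoints(int[] sample, int numPartitions) {
        int[] sorted = sample.clone();
        Arrays.sort(sorted);
        int[] splits = new int[numPartitions - 1];
        for (int i = 1; i < numPartitions; i++) {
            int index = (int) ((long) i * sorted.length / numPartitions);
            splits[i - 1] = sorted[index];
        }
        return splits;
    }

    // Partition index = number of split points that are <= key,
    // found with a binary search over the sorted split points.
    static int partitionFor(int key, int[] splits) {
        int lo = 0;
        int hi = splits.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (splits[mid] <= key) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        // Assumed sample of keys; in practice this would be drawn from the input.
        int[] sample = {7, 3, 12, 1, 9, 5, 11, 2, 8, 4, 10, 6};
        int[] splits = splitPoints(sample, 3);
        System.out.println("split points: " + Arrays.toString(splits));
        for (int key : sample) {
            System.out.println(key + " -> partition " + partitionFor(key, splits));
        }
    }
}
```

In an actual job you would not write this by hand: as far as I know, InputSampler.writePartitionFile can sample the input and write the split points, and TotalOrderPartitioner then reads them. How evenly sized the outputs come out depends on how representative the sample is.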

I'm not sure whether this is exactly what you need. Apologies if I have misunderstood.