Suppose my mappers output N distinct keys and I have K reducers. How can I write a custom Partitioner so that each reducer receives approximately N/K keys? Which keys go to which reducer is not important.
Example: Suppose my mappers output 10 pairs <k1,v1>, <k2,v2>, <k3,v3>, ..., <k10,v10>, and I have 3 reducers. I want 3 pairs to go to the 1st reducer, 3 pairs to the 2nd, and 4 pairs to the 3rd, no matter which keys go to which reducer.
What I attempted:
- Randomly assigning a reducer, e.g. randomly assigning <k1,v1> to the 1st reducer, <k2,v2> to the 2nd reducer, and so on. But some reducers still get much more data than others.
- I do not want to hard-code which keys go to which reducers, because the keys k1, k2, ..., k10 output by my mappers change with the input data, so I would have to change the code for each input. Moreover, the keys all play the same role. I just need to distribute them evenly among the reducers.
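To make the goal concrete, here is a minimal sketch in plain Java. It uses no Hadoop classes, and `PartitionSketch` and its method names are hypothetical, made up only for illustration. It counts how many keys each reducer would get under random assignment and under the even split I am after. The real hook would be Hadoop's `Partitioner.getPartition(key, value, numPartitions)`, which this sketch does not implement.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical stand-in for the partitioning logic; not a real Hadoop Partitioner.
class PartitionSketch {

    // Random assignment: every key independently picks a reducer in [0, numPartitions).
    // Returns the number of keys that landed on each reducer.
    static int[] randomCounts(int numKeys, int numPartitions, long seed) {
        Random rng = new Random(seed);
        int[] counts = new int[numPartitions];
        for (int i = 0; i < numKeys; i++) {
            counts[rng.nextInt(numPartitions)]++;
        }
        return counts;
    }

    // Desired outcome: the i-th key goes to reducer i % numPartitions,
    // so reducer loads differ by at most one key.
    static int[] roundRobinCounts(int numKeys, int numPartitions) {
        int[] counts = new int[numPartitions];
        for (int i = 0; i < numKeys; i++) {
            counts[i % numPartitions]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // 10 keys and 3 reducers, as in the example above.
        System.out.println("random:      " + Arrays.toString(randomCounts(10, 3, 42L)));
        System.out.println("round-robin: " + Arrays.toString(roundRobinCounts(10, 3)));
    }
}
```

The round-robin counts are the kind of split I want (4, 3, 3 is as good as 3, 3, 4 for me). I am not sure a simple counter like this carries over to a real job, though, since each map task would have its own partitioner instance.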
Thanks a lot.