Algorithmically, there is not much you can do, since the nature of the algorithm is exactly what you describe. The only option in this respect is to use more clusters, so that the data is divided among more reducers, but that yields a different result.
So the only thing you can do, in my opinion, is compress. And I do not only mean using one of Hadoop's compression codecs.
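(For completeness, the codec route alone is just a configuration change. A minimal sketch, assuming Hadoop 2.x property names and that the Snappy codec is available on your cluster; the mapper/reducer setup is elided:)

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.SnappyCodec;
import org.apache.hadoop.mapreduce.Job;

public class CompressedMapOutput {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Compress the intermediate map output; this cuts shuffle traffic
        // without changing the algorithm or its result.
        conf.setBoolean("mapreduce.map.output.compress", true);
        conf.setClass("mapreduce.map.output.compress.codec",
                      SnappyCodec.class, CompressionCodec.class);
        Job job = Job.getInstance(conf, "kmeans-iteration");
        // ... set mapper, reducer, and input/output paths as usual ...
    }
}
```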
For instance, you could find a more compact representation of your data: give an integer id to each element and pass only this id to the reducers. This saves network traffic (store the elements as VIntWritables, or define your own VIntArrayWritable extending ArrayWritable) and memory at each reducer.
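Such a VIntArrayWritable could look like this (a minimal sketch; ArrayWritable subclasses need the no-arg constructor so Hadoop can deserialize them):

```java
import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.VIntWritable;

// An ArrayWritable specialized to VIntWritable, so arrays of small
// integer ids are serialized with variable-length encoding.
public class VIntArrayWritable extends ArrayWritable {
    public VIntArrayWritable() {
        super(VIntWritable.class);
    }

    public VIntArrayWritable(VIntWritable[] values) {
        super(VIntWritable.class, values);
    }
}
```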
In the case of k-means, I think that a combiner is not applicable, but if it is, it would greatly reduce the network traffic and the reducers' overhead.
EDIT: It seems that you CAN use a combiner, if you follow this iterative implementation. Please edit your question to describe the algorithm that you have implemented.
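To illustrate the idea (not your exact implementation, which I haven't seen): a k-means combiner works because partial sums of points are mergeable. Below is a sketch where mappers are assumed to emit (clusterId, partial sum) pairs; PartialMeanWritable and KMeansCombiner are hypothetical names. The reducer would divide the merged sum by the merged count to get the new centroid.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.Reducer;

// Hypothetical value type: a partial coordinate sum plus a point count,
// so partial means can be merged associatively.
class PartialMeanWritable implements Writable {
    double[] sum;
    long count;

    public PartialMeanWritable() { this(0); }
    public PartialMeanWritable(int dim) { sum = new double[dim]; }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(sum.length);
        for (double s : sum) out.writeDouble(s);
        out.writeLong(count);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        sum = new double[in.readInt()];
        for (int i = 0; i < sum.length; i++) sum[i] = in.readDouble();
        count = in.readLong();
    }
}

// Combiner: merges the partial sums a mapper emits for each cluster id,
// so the reducer receives one pre-aggregated value per cluster per mapper
// instead of one value per point.
class KMeansCombiner
        extends Reducer<IntWritable, PartialMeanWritable,
                        IntWritable, PartialMeanWritable> {
    @Override
    protected void reduce(IntWritable clusterId,
                          Iterable<PartialMeanWritable> partials,
                          Context ctx) throws IOException, InterruptedException {
        PartialMeanWritable merged = null;
        for (PartialMeanWritable p : partials) {
            if (merged == null) {
                merged = new PartialMeanWritable(p.sum.length);
            }
            for (int i = 0; i < p.sum.length; i++) merged.sum[i] += p.sum[i];
            merged.count += p.count;
        }
        if (merged != null) ctx.write(clusterId, merged);
    }
}
```

Note that the combiner's output type matches its input type, which is required because Hadoop may apply a combiner zero or more times.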