For example, I have a large stream of words and want to count each word. The problem is that the words are skewed: a few words occur very frequently, while most other words are rare. In Storm, we could handle this as follows: first do a shuffle grouping on the stream, then have each node count words locally within a time window, and finally merge the local counts into the cumulative result. From another question of mine, I know that Flink only supports windows on a keyed stream; otherwise the window operation will not be parallel.
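To make the Storm-style idea concrete, here is a minimal sketch in plain Java (no Storm or Flink dependencies; the shuffle grouping and window are simulated by simply splitting the input into per-worker shards, and all class and method names are illustrative) of the two-phase counting described above: each worker counts its shard locally, then the partial counts are merged.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TwoPhaseWordCount {
    // Phase 1: each "worker" counts the words in its own shard locally,
    // so a hot key only produces one partial count per worker.
    static Map<String, Long> localCount(List<String> shard) {
        Map<String, Long> counts = new HashMap<>();
        for (String word : shard) {
            counts.merge(word, 1L, Long::sum);
        }
        return counts;
    }

    // Phase 2: merge the per-worker partial counts into the cumulative result.
    static Map<String, Long> merge(List<Map<String, Long>> partials) {
        Map<String, Long> total = new HashMap<>();
        for (Map<String, Long> partial : partials) {
            partial.forEach((word, n) -> total.merge(word, n, Long::sum));
        }
        return total;
    }

    public static void main(String[] args) {
        // Skewed input: "the" dominates; a shuffle-like split spreads it
        // evenly over two workers instead of sending it all to one key owner.
        List<String> shard1 = List.of("the", "the", "the", "cat");
        List<String> shard2 = List.of("the", "the", "dog", "the");

        Map<String, Long> total =
            merge(List.of(localCount(shard1), localCount(shard2)));
        System.out.println(total.get("the")); // 6
        System.out.println(total.get("cat")); // 1
    }
}
```

The point of the pre-aggregation phase is that the downstream merge only sees one partial count per word per worker, rather than one record per occurrence of a hot word.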
My question is: is there a good way to handle this kind of data skew in Flink?