I have a DataStream from Kafka which has 2 possible value for a field in MyModel. MyModel is a pojo with domain-specific fields parsed from a message from Kafka.
DataStream<MyModel> stream = env.addSource(myKafkaConsumer);
I want to apply window and operators on each key a1, a2 separately. What is a good way to separate them? I have 2 options filter and select in mind but don't know which one is faster.
Filter approach
stream
.filter(<MyModel.a == a1>)
.keyBy()
.window()
.apply()
.addSink()
stream
.filter(<MyModel.a == a2>)
.keyBy()
.window()
.apply()
.addSink()
Split and select approach
SplitStream<MyModel> split = stream.split(…)
split
.select(<MyModel.a == a1>)
…
.addSink()
split
.select<MyModel.a == a2>()
…
.addSink()
If split and select are better, how to implement them if I want to split based on the value of a field in MyModel?
split
. Maybe I did something wrong but in my current projectenv.execute()
throws weird exception when using split. Then I replacedsplit
withfilter
and it solved my problem. – bolei