My flink application does the following
- source: read data in form of records from Kafka
- split: based on certain criteria
- window : timewindow of 10seconds to aggregate into one bulkrecord
- sink: dump these bulkrecords to elasticsearch
I am facing issue where flink consumer is not able to hold data for 10seconds, and is throwing the following exception:
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: Size of the state is larger than the maximum permitted memory-backed state. Size=18340663 , maxSize=5242880
I cannot apply countWindow, because if the frequency of records is too slow, then the elasticsearch sink might be deferred for a long time.
My question:
Is it possible to apply a OR function of TimeWindow and CountWindow, which goes as
> if ( recordCount is 500 OR 10 seconds have elapsed)
> then dump data to flink