I have the following CEP PatternStream, where the DataStream is partitioned by entity ID because I am only interested in pattern matches among events that share the same entity ID:
PatternStream<EntityMetric> patternStream = CEP.pattern(inputStream.keyBy(EntityMetric.ATTR_ENTITY_ID), thresholdPattern);
However, I noticed that the checkpoint state size grows as the number of entity IDs grows. If I understand checkpointing correctly, this is expected, since the number of operator states increases. I would still like to find out whether there is any way to minimize the checkpoint state size.
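To make my understanding concrete, here is a toy model in plain Java of why I expect the state to grow with the number of keys. It is only a sketch under my own assumptions and does not reflect Flink's actual CEP internals: the class `KeyedStateGrowth`, the per-key buffer, and the cap of 3 buffered events per key are all hypothetical stand-ins for whatever partial-match state is kept per entity ID.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Toy model only: NOT Flink's CEP internals. All names here are hypothetical.
public class KeyedStateGrowth {
    // One buffer of recent values per entity ID, standing in for per-key pattern state.
    private final Map<String, Deque<Double>> bufferByEntity = new HashMap<>();
    // Upper bound on buffered events per key (assumed, for illustration).
    private final int maxBufferedPerKey;

    public KeyedStateGrowth(int maxBufferedPerKey) {
        this.maxBufferedPerKey = maxBufferedPerKey;
    }

    // Record an event under its entity ID, evicting the oldest value once the buffer is full.
    public void onEvent(String entityId, double value) {
        Deque<Double> buffer = bufferByEntity.computeIfAbsent(entityId, id -> new ArrayDeque<>());
        buffer.addLast(value);
        if (buffer.size() > maxBufferedPerKey) {
            buffer.removeFirst();
        }
    }

    // Number of distinct entity IDs that currently hold state.
    public int keyCount() {
        return bufferByEntity.size();
    }

    // Total number of buffered events across all keys.
    public int bufferedEvents() {
        int total = 0;
        for (Deque<Double> buffer : bufferByEntity.values()) {
            total += buffer.size();
        }
        return total;
    }

    public static void main(String[] args) {
        // Feed 5 events per entity and report how much is retained as the key count grows.
        for (int entities : new int[] {10, 100, 1000}) {
            KeyedStateGrowth state = new KeyedStateGrowth(3);
            for (int e = 0; e < entities; e++) {
                for (int i = 0; i < 5; i++) {
                    state.onEvent("entity-" + e, i);
                }
            }
            System.out.println(entities + " keys -> " + state.bufferedEvents() + " buffered events");
        }
    }
}
```

In this model the per-key buffer is bounded, yet the total retained state still scales linearly with the number of distinct entity IDs, which matches what I observe in the checkpoint size.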
Is there a different way to implement this pattern matching without partitioning the DataStream based on entity ID?
Is there any other technique or configuration setting that can help reduce the checkpoint state size?
Thanks!