I have a Kinesis stream with 20 shards that is consumed by a KCL-based Kinesis consumer. The consumer is deployed in ECS with 20 instances (which I take to mean 20 separate KCL instances).
What I believed would happen in this scenario is:
- Each instance would create 20 worker threads, one per shard, independently of the other instances.
- So at any given time, a shard would have 20 separate threads connecting to it
- The same set of records would get processed by each instance (i.e., duplicate record processing would not be prevented across the instances)
- This would also exceed the read rate limit on each shard (5 transactions per second)
- Running a single instance of my consumer is sufficient. In other words, scaling the consumer across multiple instances will not have any benefits at all.
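To make the numbers behind this assumed scenario concrete, here is a small back-of-the-envelope sketch. It only models what I believed would happen (no coordination between instances), not what KCL actually does. The function name `no_coordination` and the polling rate of one read per thread per second are my own assumptions for illustration.

```python
SHARDS = 20
INSTANCES = 20
READ_LIMIT_TPS = 5            # per-shard read limit mentioned above
POLLS_PER_THREAD_PER_SEC = 1  # assumption: each thread polls its shard once per second


def no_coordination(shards, instances, polls_per_sec):
    """Model instances that each read every shard, unaware of one another."""
    threads_total = shards * instances          # every instance starts one thread per shard
    threads_per_shard = instances               # so each shard is read by one thread per instance
    reads_per_shard = threads_per_shard * polls_per_sec
    return threads_total, threads_per_shard, reads_per_shard


if __name__ == "__main__":
    total, per_shard, reads = no_coordination(SHARDS, INSTANCES, POLLS_PER_THREAD_PER_SEC)
    print(f"threads in total:        {total}")
    print(f"threads per shard:       {per_shard}")
    print(f"reads per shard per sec: {reads} (limit: {READ_LIMIT_TPS})")
    print(f"over the limit:          {reads > READ_LIMIT_TPS}")
```

Under these assumptions every shard would be read far more often than the limit allows, which is why I concluded that adding instances could not help.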
This answer seems to suggest that the "shard's lease" would ensure that a shard is only processed by a single instance. However, the second answer here says that "A KCL instance will only start one process per shard, but you can have another KCL instance consuming the same stream (and shard), assuming the second one has permission.".
Furthermore, this documentation suggests "Increasing the number of instances up to the maximum number of open shards" as a possible scale-up approach, which contradicts some of the points above.
How do the consumer instances actually function in this scenario?
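For comparison, this is a toy model of what the lease-based reading would imply if it is correct: each shard has exactly one owner at a time, and shards are spread across the available instances. The round-robin assignment and the names `assign_leases` and `load_per_worker` are purely illustrative assumptions on my part; this is not how KCL implements its lease coordination.

```python
from collections import Counter


def assign_leases(shard_ids, worker_ids):
    """Give every shard exactly one owner, spreading shards round-robin (illustrative only)."""
    return {shard: worker_ids[i % len(worker_ids)] for i, shard in enumerate(shard_ids)}


def load_per_worker(leases, worker_ids):
    """Count how many shards each worker owns, including workers that own none."""
    counts = Counter(leases.values())
    return {worker: counts.get(worker, 0) for worker in worker_ids}


if __name__ == "__main__":
    shards = [f"shard-{i:02d}" for i in range(20)]
    for n in (1, 5, 20, 25):
        workers = [f"worker-{i:02d}" for i in range(n)]
        load = load_per_worker(assign_leases(shards, workers), workers)
        idle = sum(1 for owned in load.values() if owned == 0)
        print(f"{n:2d} workers: max shards per worker = {max(load.values())}, idle workers = {idle}")
```

In this model, adding instances reduces the number of shards each one handles until there is one instance per shard, and any further instances sit idle. That would be consistent with the "up to the maximum number of open shards" wording, but it is exactly the behaviour I would like confirmed.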