1
votes

I have a Lambda with an Event source pointed to a Kinesis Stream Consumer. The stream has 30 shards.

I can see requests coming in on the lambda console, and I can see metrics in the Enhanced Fan Out section of the kinesis console so it appears everything is configured right.

However, the concurrent executions of the Lambda are capped at 10 for some reason and I can't figure out why. Most of the documentation suggests that when Enhanced Fan-Out is used and a Lambda is listening to a stream consumer, one Lambda invocation per shard should be running.

Can anyone explain how concurrency with Lambda stream consumers works?

1
For the sake of completeness, can you show the put metrics for your shards to demonstrate that more than 10 of them are receiving records concurrently? – thomasmichaelwallace

1 Answer

1
votes

I have a couple of pointers, just in case. The first is to make sure your Lambda concurrency limit is actually over 10. It should be, since the account default is 1,000, but it doesn't hurt to check.
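As a quick way to check, here is a minimal sketch using boto3 (the function name "my-stream-consumer" is a placeholder for your own Lambda):

```python
import boto3

lambda_client = boto3.client("lambda")

# Account-wide concurrency limit (defaults to 1000) and how much of it is unreserved.
settings = lambda_client.get_account_settings()
print(settings["AccountLimit"]["ConcurrentExecutions"])
print(settings["AccountLimit"]["UnreservedConcurrentExecutions"])

# Reserved concurrency configured on the function, if any.
reserved = lambda_client.get_function_concurrency(FunctionName="my-stream-consumer")
print(reserved.get("ReservedConcurrentExecutions", "no reserved concurrency set"))
```

If the function has reserved concurrency of 10, that alone would explain the cap.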

As for how Lambda stream consumers work, the details are in the Lambda docs.

One thing I've often seen with Kinesis Data Streams is trouble with the partition key of the records. As you probably know, Kinesis Data Streams sends all records with the same partition key to the same shard so they can be processed in the right order. If records were sent to arbitrary shards (for example, via a simple round-robin), there would be no guarantee they would be processed in order, since different shards are read by different processors.

It's important, then, to distribute your keys as evenly as possible. If most records share the same partition key, one shard will be very busy while the others get little traffic. It may be that you are using only 10 distinct partition key values, in which case you would be sending data to only 10 shards, and since each Lambda invocation is attached to a single shard, you would see only 10 concurrent executions.
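One rough way to check the spread is to put a small sample of probe records and tally the ShardId that PutRecord reports. This is only a sketch with boto3; "my-stream" and the sample key space are placeholders for your own stream and real keys:

```python
import boto3
from collections import Counter

kinesis = boto3.client("kinesis")
shard_hits = Counter()

sample_keys = [f"customer-{i}" for i in range(100)]  # substitute your real key space
for key in sample_keys:
    response = kinesis.put_record(
        StreamName="my-stream",
        Data=b'{"probe": true}',
        PartitionKey=key,
    )
    shard_hits[response["ShardId"]] += 1  # PutRecord tells you which shard the record landed in

# With 30 shards and well-distributed keys you should see close to 30 distinct shard IDs;
# if only 10 show up, that matches the 10 concurrent executions you are observing.
print(shard_hits)
```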

You can see which shard a record went to by checking the ShardId in the PutRecord response. You can also force a specific shard by overriding the hashing mechanism with an explicit hash key. There is more information about partition key processing and record ordering in the SDK docs.
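For example, here is a minimal sketch of overriding the hashing: ExplicitHashKey bypasses the MD5 hash of the partition key and places the record in whichever shard owns that hash-key range. The stream name and payload are placeholders:

```python
import boto3

kinesis = boto3.client("kinesis")

# Look up each shard's starting hash key so we can target it directly.
shards = kinesis.describe_stream(StreamName="my-stream")["StreamDescription"]["Shards"]
for shard in shards:
    start = shard["HashKeyRange"]["StartingHashKey"]
    kinesis.put_record(
        StreamName="my-stream",
        Data=b'{"probe": true}',
        PartitionKey="ignored-for-placement",  # still required by the API
        ExplicitHashKey=start,                 # forces the record into this shard's range
    )
```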

Also make sure you read the troubleshooting guide: occasionally the same record can be processed by two consumers concurrently, and you'll want to be prepared for that.
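One way to tolerate duplicate delivery, sketched below under the assumption of a DynamoDB table (the table name "processed-records" and the process() helper are placeholders), is to record each Kinesis sequence number with a conditional write and skip records you've already seen:

```python
import base64
import boto3
from botocore.exceptions import ClientError

table = boto3.resource("dynamodb").Table("processed-records")  # placeholder table

def handler(event, context):
    for record in event["Records"]:
        seq = record["kinesis"]["sequenceNumber"]
        try:
            # Conditional write fails if this sequence number was already recorded.
            table.put_item(
                Item={"sequenceNumber": seq},
                ConditionExpression="attribute_not_exists(sequenceNumber)",
            )
        except ClientError as err:
            if err.response["Error"]["Code"] == "ConditionalCheckFailedException":
                continue  # already handled by another invocation
            raise
        payload = base64.b64decode(record["kinesis"]["data"])
        process(payload)  # your business logic (placeholder)
```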

I don't know whether your issue is related to any of these pointers, but partition-key skew is a recurring problem, so I thought I would mention it. Good luck!