5
votes

I have already read some questions about Kinesis shards and multiple consumers, but I still don't understand how it works.

My use case: I have a Kinesis stream with just one shard. I would like to consume this shard using different Lambda functions, each of them independently, as if each Lambda function had its own shard iterator.

Is it possible to set up multiple stream-based Lambda consumers reading from the same stream/shard?


5 Answers

5
votes

Hey Mr Magalhaes, I believe the following picture should answer some of your questions.

[Picture: Processing Streams: Lambda]

So to clarify, you can set multiple Lambdas as consumers on a Kinesis stream, but the Lambdas will block each other on processing. If your stream has only one shard, it will only have one concurrent Lambda.

4
votes

If you have one Kinesis stream, you can connect as many Lambda functions as you want, each through its own event source mapping.

All functions will run simultaneously and fully independently of each other, and each will be invoked whenever new records arrive in the stream. The number of shards does not matter.
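As a rough illustration of attaching several functions to one stream, here is a minimal Python sketch that builds one event source mapping request per function. The stream ARN and the function names are hypothetical placeholders. The keyword names follow boto3's `create_event_source_mapping`, and the actual call is only shown in a comment because it needs AWS credentials.

```python
def build_mapping_requests(stream_arn, function_names,
                           starting_position="LATEST", batch_size=100):
    """Return one create_event_source_mapping request per Lambda function.

    Every function gets its own mapping on the same stream, so each
    mapping keeps its own read position in the shard.
    """
    return [
        {
            "EventSourceArn": stream_arn,
            "FunctionName": name,
            "StartingPosition": starting_position,
            "BatchSize": batch_size,
        }
        for name in function_names
    ]


# Hypothetical ARN and function names, for illustration only.
STREAM_ARN = "arn:aws:kinesis:us-east-1:123456789012:stream/example-stream"
mapping_requests = build_mapping_requests(STREAM_ARN, ["consumer-a", "consumer-b"])

# With AWS credentials configured, the requests could be sent like this:
#   import boto3
#   client = boto3.client("lambda")
#   for params in mapping_requests:
#       client.create_event_source_mapping(**params)
```

Both requests point at the same `EventSourceArn`; only `FunctionName` differs.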

2
votes

For a single lambda function: "For Lambda functions that process Kinesis or DynamoDB streams the number of shards is the unit of concurrency. If your stream has 100 active shards, there will be at most 100 Lambda function invocations running concurrently. This is because Lambda processes each shard’s events in sequence." [https://docs.aws.amazon.com/lambda/latest/dg/scaling.html]

But there is no limit on how many different Lambda consumers you can attach to a Kinesis stream.

1
votes

Yes, no problem with this!

The number of shards doesn't limit the number of consumers a stream can have. In your case, it will just limit the number of concurrent invocations of each Lambda. This means that each consumer can have at most as many concurrent executions as there are shards.

See this doc for more details.

0
votes

Short answer:

Yes, it will work, and it will work concurrently.

Long answer:

Each shard in a Kinesis stream has 2 MiB/sec of read throughput: https://docs.aws.amazon.com/streams/latest/dev/building-consumers.html

If you have multiple applications (in your case, Lambdas), they will share the throughput. A description taken from the link above:

Fixed at a total of 2 MiB/sec per shard. If there are multiple consumers reading from the same shard, they all share this throughput. The sum of the throughput they receive from the shard doesn't exceed 2 MiB/sec.

If you write less than 1 MiB/sec of data, you should be able to support two "applications" with a single shard.

In general, if you have Y shards and X applications, it should work properly assuming your total write throughput is less than 2 MiB/sec * Y / X and that the data is spread equally between shards.
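To make the arithmetic concrete, here is a small Python sketch of that rule of thumb. The function and constant names are made up for this example, and it assumes every application reads all of the data and that records are spread evenly across shards.

```python
# Shared read limit per shard, as quoted from the Kinesis docs above.
SHARD_READ_LIMIT_MIB_PER_SEC = 2.0


def max_write_throughput(shards, applications):
    """Upper bound on total write throughput in MiB/sec: 2 * Y / X.

    Each of the X applications reads every record, so the shared read
    budget of the Y shards is divided between them.
    """
    if shards < 1 or applications < 1:
        raise ValueError("shards and applications must both be at least 1")
    return SHARD_READ_LIMIT_MIB_PER_SEC * shards / applications


print(max_write_throughput(1, 2))  # 1.0 (one shard, two applications)
print(max_write_throughput(4, 2))  # 4.0 (four shards, two applications)
```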

If you require each "application" to use 2 MiB/sec, you can enable "Consumers with Enhanced Fan-Out", which fans out the stream, giving each application a dedicated 2 MiB/sec per shard (instead of sharing the throughput).

This is described in the following link: https://docs.aws.amazon.com/streams/latest/dev/introduction-to-enhanced-consumers.html

In Amazon Kinesis Data Streams, you can build consumers that use a feature called enhanced fan-out. This feature enables consumers to receive records from a stream with throughput of up to 2 MiB of data per second per shard. This throughput is dedicated, which means that consumers that use enhanced fan-out don't have to contend with other consumers that are receiving data from the stream.
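If you go the enhanced fan-out route, registering consumers might look roughly like the Python sketch below. It calls `register_stream_consumer`, which exists on the boto3 Kinesis client, but the demo runs against a small fake client so it works without AWS access. The fake client, the stream ARN, the consumer names, and the shape of the fake consumer ARN are all assumptions for illustration.

```python
def register_fan_out_consumers(kinesis_client, stream_arn, consumer_names):
    """Register one enhanced fan-out consumer per name; return name -> ARN."""
    arns = {}
    for name in consumer_names:
        response = kinesis_client.register_stream_consumer(
            StreamARN=stream_arn,
            ConsumerName=name,
        )
        arns[name] = response["Consumer"]["ConsumerARN"]
    return arns


class FakeKinesisClient:
    """Offline stand-in for boto3.client("kinesis"); the ARN it returns is made up."""

    def register_stream_consumer(self, StreamARN, ConsumerName):
        return {
            "Consumer": {
                "ConsumerName": ConsumerName,
                "ConsumerARN": f"{StreamARN}/consumer/{ConsumerName}:0",
            }
        }


# Hypothetical ARN and consumer names, for illustration only.
STREAM_ARN = "arn:aws:kinesis:us-east-1:123456789012:stream/example-stream"
consumer_arns = register_fan_out_consumers(
    FakeKinesisClient(), STREAM_ARN, ["consumer-a", "consumer-b"]
)
print(consumer_arns["consumer-a"])
```

Against real AWS you would pass `boto3.client("kinesis")` instead of the fake client. For a Lambda consumer, the returned consumer ARN is then used as the `EventSourceArn` of that function's event source mapping in place of the stream ARN.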