
Assumption:

Let's assume the rate of data insertion into DynamoDB is very high.

Context:

Streams are enabled on the DynamoDB table, and new stream records trigger a Lambda function. The Lambda reads each streamed record and indexes it in Elasticsearch.
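For reference, a minimal sketch of such a handler (assuming a Python runtime and the elasticsearch-py client; the endpoint, index name, and "pk" key field are placeholders, and the naive attribute-value flattening is just for illustration):

    import os
    from elasticsearch import Elasticsearch

    # Create the client once, outside the handler, so warm invocations reuse
    # the connection instead of opening a new one per event.
    es = Elasticsearch(os.environ["ES_ENDPOINT"])  # placeholder endpoint

    def handler(event, context):
        for record in event["Records"]:
            if record["eventName"] in ("INSERT", "MODIFY"):
                image = record["dynamodb"]["NewImage"]
                # Stream images are in DynamoDB attribute-value form,
                # e.g. {"name": {"S": "foo"}}; flatten naively for the sketch.
                doc = {k: list(v.values())[0] for k, v in image.items()}
                # "pk" is a hypothetical key attribute; note elasticsearch-py
                # 8.x takes document= instead of body=.
                es.index(index="my-index", id=doc["pk"], body=doc)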

Problem statement:

There is a delay between the time a record gets inserted into DynamoDB and the time the Lambda is triggered with the streamed record. This delay, or lag, keeps increasing and is directly proportional to the amount of data being inserted into DynamoDB.

How do I find where the lag is? Is the stream not triggering the Lambda immediately? Is the stream getting backed up because of the huge volume of DynamoDB writes? Or is there a limit so that the Lambda cannot be invoked more than a certain number of times per second?

I can't tell where the issue is, because I can't even see whether the stream still contains records, or whether the records in the stream have already been delivered and the Lambda trigger itself is the lag.
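One way to narrow this down: for stream-based event sources, Lambda publishes an IteratorAge metric, the age in milliseconds of the last record the function read from the stream. If IteratorAge keeps growing, records are still sitting in the stream waiting to be read; they are not lost between the stream and the trigger. A sketch that pulls the metric with boto3 (the function name my-indexer is a placeholder):

    from datetime import datetime, timedelta

    import boto3

    cw = boto3.client("cloudwatch")
    resp = cw.get_metric_statistics(
        Namespace="AWS/Lambda",
        MetricName="IteratorAge",  # ms between record write and record read
        Dimensions=[{"Name": "FunctionName", "Value": "my-indexer"}],
        StartTime=datetime.utcnow() - timedelta(hours=3),
        EndTime=datetime.utcnow(),
        Period=300,
        Statistics=["Maximum"],
    )
    for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
        print(point["Timestamp"], point["Maximum"])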

Example delay: we did huge writes yesterday, and we are seeing those records hitting the Lambda today! Incredible delay! :)

Any suggestions please?

Are you looking at the monitoring tab for Lambda? You should see the IteratorAge, the execution time, and concurrent invocations. What is the average execution time? – jny
Concurrent invocations go up to 4.6k; those spikes are not consistent, but I am seeing enough of them, ranging from 800 to 4.6k. IteratorAge is going up to 43M to 110M, and average duration is hitting 30k to 300k. – Deepak
The execution time is the problem. First, you can try increasing the memory for the Lambda, but you also need to look at other ways to lower the execution time. Also make sure that you don't have partitions with a large number of items. – jny
I think I figured it out. The number of triggers to the Lambda is huge, and the IteratorAge went super high. We need to optimize the Lambda so that it becomes available for all the incoming records. – Deepak

1 Answer


From the Lambda documentation:

For Lambda functions that process Kinesis or DynamoDB streams the number of shards is the unit of concurrency. If your stream has 100 active shards, there will be at most 100 Lambda function invocations running concurrently. This is because Lambda processes each shard’s events in sequence.

The logic for creating shards is not exposed to the end user, but it depends on your RCUs and WCUs (read and write capacity units). Increasing them too much will cost you money.
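If you do decide to raise provisioned capacity, it is a single API call; a sketch with boto3 (table name and values are placeholders):

    import boto3

    dynamodb = boto3.client("dynamodb")
    # Raising provisioned throughput, which indirectly raises the shard
    # count; note that this also raises your bill.
    dynamodb.update_table(
        TableName="my-table",
        ProvisionedThroughput={
            "ReadCapacityUnits": 500,
            "WriteCapacityUnits": 500,
        },
    )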

On top of this, there is a limit on concurrent Lambda executions per account, depending on your region of operation (see here).
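You can check both the limit and the remaining headroom from the API, for example with boto3:

    import boto3

    lambda_client = boto3.client("lambda")
    settings = lambda_client.get_account_settings()
    # Total concurrent executions allowed for this account in this region.
    print(settings["AccountLimit"]["ConcurrentExecutions"])
    # Concurrency still available to functions without reserved concurrency.
    print(settings["AccountLimit"]["UnreservedConcurrentExecutions"])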

Here are a few things that you can do:

  1. Make sure that while ingesting data into DynamoDB you use a random partition key instead of an ordered one, so that the probability of hitting different shards increases.
  2. Make sure you are reusing the connection to Elasticsearch while ingesting data (as in the handler sketch above, where the client is created outside the handler).
  3. Increase the batch size so that multiple records can be ingested by the same Lambda invocation; a sketch follows this list. See Batch size here
  4. Use a scripting language instead of Java to reduce cold-start issues.
  5. See if there are other Lambdas running that push you into the account's concurrent execution limit (this should be highly unlikely).
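For point 3, the batch size is a property of the event source mapping between the stream and the function, and can be changed in place. A sketch with boto3 (the function name is a placeholder):

    import boto3

    lambda_client = boto3.client("lambda")

    # Look up the mapping(s) for the function, then raise the batch size.
    mappings = lambda_client.list_event_source_mappings(
        FunctionName="my-indexer"
    )
    for mapping in mappings["EventSourceMappings"]:
        lambda_client.update_event_source_mapping(
            UUID=mapping["UUID"],
            BatchSize=500,  # DynamoDB streams allow up to 1,000
        )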