1 vote

We're using API Gateway + a Lambda function + DynamoDB to fetch data, using DynamoDB's query method. For 260.4 KB of data (total item count: 675 | scanned count: 3327) it takes 3.49 s.

Requirement:

We have 4+ clients. We calculate each client's sales users' data on a daily basis and store it in the DB.

Table Structure:

  • Primary Key: ClientId
  • Sort Key: Date+UserId
  • Other Attributes: Date

In the query, we use the primary key ClientId together with Date to get the data.
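
For illustration, the query input is shaped roughly like this (a sketch assuming the DocumentClient; the table name and the sort-key attribute name are made up):

const queryInput = {
  TableName: 'ClientSalesData', // hypothetical table name
  KeyConditionExpression: 'ClientId = :clientId AND begins_with(DateUserId, :date)',
  ExpressionAttributeValues: {
    ':clientId': 'client-1', // partition key value
    ':date': '2021-03-24',   // the Date prefix of the Date+UserId sort key
  },
};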

Currently we're using on-demand capacity mode for DynamoDB, yet we feel that a response time > 1 s is too much.

Is there any way we can improve this using any AWS configurations?

Update [24/03/2021]: In Lambda we are using Node.js.

module.exports.executeQuery = async (dynamoDbClient, queryInput) => {
  // Wrap the callback-based query call in a Promise so it can be awaited.
  return new Promise((resolve, reject) => {
    dynamoDbClient.query(queryInput, (err, users) => {
      if (err) {
        // Translate the DynamoDB error before rejecting.
        reject(handleQueryError(err));
      } else {
        resolve({
          statusCode: 200,
          users,
        });
      }
    });
  });
};
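
Side note: assuming this is the AWS SDK for JavaScript v2 (which the callback signature suggests), the same function can be written without the hand-rolled Promise wrapper by using the SDK's built-in .promise() helper:

module.exports.executeQuery = async (dynamoDbClient, queryInput) => {
  try {
    // .promise() converts the callback-style request into a native Promise.
    const users = await dynamoDbClient.query(queryInput).promise();
    return { statusCode: 200, users };
  } catch (err) {
    throw handleQueryError(err);
  }
};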

Memory Provisioned to Lambda = 128 MB

That seems long. Can you add a sample item to your question and a screenshot of the query latency metrics? Which language are you using and how much RAM have you provisioned to Lambda? If you're using Python in Lambda, you might benefit from this blog post about performance measurements with DynamoDB and different configurations that I literally just published (disclaimer: I wrote this, it's on topic). – Maurice
You can use github.com/alexcasalboni/aws-lambda-power-tuning to help you detect the best power configuration to minimize cost and/or maximize performance. I highly recommend it and have used it myself in similar scenarios (with RDS Proxy instead of DynamoDB, though). – oieduardorabelo
If you are only using Lambda for querying and are not changing the data returned from DynamoDB, you can try using API Gateway as a proxy for DynamoDB: aws.amazon.com/blogs/compute/… – nirvana124
Increase the memory your Lambda function has and check whether it gets faster; from my experience there is a very significant difference. – Maurice
@GuruDeepak Cool, I've added my suggestion as an answer :) – Maurice

2 Answers

3 votes

Your query is scanning 3,327 items, so the ~3.5 s response time doesn't surprise me. That sounds about right from my experience.

The underlying problem here is a lack of threads or parallel processing. You can easily prove that this is the case by running this CLI command:

aws dynamodb scan --table-name YOURTABLENAME --total-segments X --segment 0 --select COUNT

Replace YOURTABLENAME with your table name, and X with the number of MBs of data in your table. So if you have 100 MB of data, use 100.

This runs one segment of an X-way parallel scan. Run X of these commands concurrently (one for each --segment value from 0 to X-1) and together they read every item in your table, returning in about 1 s.

You can then try a scan with --total-segments 1 (which runs with one thread) and see how much longer it takes.

What this demonstrates is the need to fetch large amounts of data in parallel.
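
The same demonstration works from Node.js: issue one Scan per segment and await them all together. A minimal sketch, assuming the AWS SDK v2 DocumentClient (pagination via LastEvaluatedKey is omitted for brevity):

// Count all items by scanning `totalSegments` segments concurrently.
const parallelScanCount = async (dynamoDbClient, tableName, totalSegments) => {
  const segmentScans = Array.from({ length: totalSegments }, (_, segment) =>
    dynamoDbClient.scan({
      TableName: tableName,
      TotalSegments: totalSegments, // how many slices the table is split into
      Segment: segment,             // which slice this request reads
      Select: 'COUNT',
    }).promise()
  );
  const results = await Promise.all(segmentScans);
  return results.reduce((sum, page) => sum + page.Count, 0);
};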

Your partitions are too large. If you try a key with less data, perhaps tens of records, I expect the query will be fast.

You might want to look into sharding techniques to reduce the amount of data in your partitions; you can then Query those partitions in parallel. Note that DynamoDB does not provide a BatchQuery method (which is a shame), so you have to write your own parallel Query logic, as sketched below.
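
A hand-rolled parallel Query can be as simple as fanning out one Query per shard and merging the results. A sketch, assuming the DocumentClient and a hypothetical sharded partition key of the form ClientId#shardNumber (pagination again omitted):

// Query `shardCount` sharded partitions concurrently and merge the items.
const parallelQuery = async (dynamoDbClient, tableName, clientId, shardCount) => {
  const shardQueries = Array.from({ length: shardCount }, (_, shard) =>
    dynamoDbClient.query({
      TableName: tableName,
      KeyConditionExpression: 'ClientId = :pk',
      // e.g. 'client-1#0', 'client-1#1', ... one small partition per shard
      ExpressionAttributeValues: { ':pk': `${clientId}#${shard}` },
    }).promise()
  );
  const results = await Promise.all(shardQueries);
  return results.flatMap((page) => page.Items);
};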

1 vote

As suggested in the comments, I'd start by increasing the memory size of the Lambda function.

Lambda CPU performance scales with memory, and from my experience parsing larger responses from DynamoDB benefits a lot from the additional CPU.
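
Raising the memory is a one-line change, for example via the AWS CLI (the function name here is illustrative):

aws lambda update-function-configuration --function-name yourFunction --memory-size 1024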

I did a performance analysis in a blog post (disclaimer: it's on my employer's tech blog, and it's on topic, albeit for Python) a couple of days ago and found significant performance differences between memory sizes.