DynamoDB GSI query response is very slow

Question

I have a DynamoDB table that stores the status of different cron jobs.

Table attributes:

id [HashKey]
jobId [RangeKey]
status ('failed','pending', 'success')

I want to query the items based on the job status field.

E.g., list all jobs that are in the pending state.

So I created the GSI as below.

GSI:

{
  IndexName: 'StatusIndex',
  KeySchema: [
    {
      AttributeName: 'status',
      KeyType: 'HASH',
    },
  ],
  Projection: {
    ProjectionType: 'ALL',
  },
},

But the query on the GSI is very slow when all the items contain the same status value.

id	jobId	status
1	job1	pending
2	job2	pending
3	job3	pending
4	job4	pending

Is this because the index does not have a range key?

How slow? What performance are you seeing and what are you expecting? Can you show us how you are querying the index? — Seth Geoghegan

F_SO_K · Accepted Answer · 2020-12-17T13:56:16

You might be better off with a Parallel Scan here. A Query does not have parallel functionality, so if you're trying to get a very large amount of data in one Query, it will be slow: each call returns at most 1 MB, and the pages have to be fetched one after another. If you use a Parallel Scan, set the number of threads to match the number of MBs of data in your table to optimise the speed. This will cost you more RCUs than a Query.
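As a rough illustration of the Parallel Scan approach, here is a minimal sketch. It assumes an AWS SDK for JavaScript v2 `AWS.DynamoDB.DocumentClient` is passed in as `docClient`; the table name `JobStatus` in the usage note and the function names are hypothetical. Each segment is scanned by its own worker, and the filter on `status` is applied after the items are read, so the whole table is still paid for in RCUs.

```javascript
// Build the Scan parameters for one segment of a parallel scan.
// "status" is a DynamoDB reserved word, so it is aliased as #s.
function buildSegmentScanParams(tableName, status, segment, totalSegments, startKey) {
  const params = {
    TableName: tableName,
    Segment: segment,
    TotalSegments: totalSegments,
    FilterExpression: '#s = :status',
    ExpressionAttributeNames: { '#s': 'status' },
    ExpressionAttributeValues: { ':status': status },
  };
  // Only set ExclusiveStartKey when continuing from a previous page.
  if (startKey) params.ExclusiveStartKey = startKey;
  return params;
}

// Read every page of a single segment, following LastEvaluatedKey.
async function scanSegment(docClient, tableName, status, segment, totalSegments) {
  const items = [];
  let lastKey;
  do {
    const params = buildSegmentScanParams(tableName, status, segment, totalSegments, lastKey);
    const page = await docClient.scan(params).promise();
    items.push(...(page.Items || []));
    lastKey = page.LastEvaluatedKey;
  } while (lastKey);
  return items;
}

// Scan all segments concurrently and merge the results.
async function parallelScanByStatus(docClient, tableName, status, totalSegments) {
  const workers = [];
  for (let segment = 0; segment < totalSegments; segment++) {
    workers.push(scanSegment(docClient, tableName, status, segment, totalSegments));
  }
  const results = await Promise.all(workers);
  return results.flat();
}
```

With a real client this might be called as `parallelScanByStatus(new AWS.DynamoDB.DocumentClient(), 'JobStatus', 'pending', 8)`, where 8 segments would follow the rule of thumb above for a table of roughly 8 MB.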

Alternatively, you can consider remodeling your data. You will need a way of splitting the desired data across multiple Queries, and a way of running them in parallel from your client. One option you can consider is breaking the data down into time series.
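One possible shape for the time series idea is sketched below. This is only an assumption about how the data could be remodeled: each item would carry an extra attribute `statusDay` (for example `pending#2020-12-17`) that serves as the hash key of a hypothetical GSI named `StatusDayIndex`. The client then issues one Query per day and runs them in parallel. As before, `docClient` is assumed to be an AWS SDK for JavaScript v2 `AWS.DynamoDB.DocumentClient`.

```javascript
// Build the hypothetical composite key, e.g. "pending#2020-12-17" (UTC day).
function statusDayKey(status, date) {
  return `${status}#${date.toISOString().slice(0, 10)}`;
}

// List the UTC days from `from` to `to`, inclusive.
function daysBetween(from, to) {
  const days = [];
  const cursor = new Date(Date.UTC(from.getUTCFullYear(), from.getUTCMonth(), from.getUTCDate()));
  while (cursor <= to) {
    days.push(new Date(cursor));
    cursor.setUTCDate(cursor.getUTCDate() + 1);
  }
  return days;
}

// Read every page of one index partition, following LastEvaluatedKey.
async function queryPartition(docClient, tableName, indexName, keyValue) {
  const items = [];
  let lastKey;
  do {
    const params = {
      TableName: tableName,
      IndexName: indexName,
      KeyConditionExpression: 'statusDay = :k',
      ExpressionAttributeValues: { ':k': keyValue },
    };
    if (lastKey) params.ExclusiveStartKey = lastKey;
    const page = await docClient.query(params).promise();
    items.push(...(page.Items || []));
    lastKey = page.LastEvaluatedKey;
  } while (lastKey);
  return items;
}

// Run one Query per day in parallel and merge the results.
async function queryStatusOverDays(docClient, tableName, status, from, to) {
  const keys = daysBetween(from, to).map((day) => statusDayKey(status, day));
  const results = await Promise.all(
    keys.map((key) => queryPartition(docClient, tableName, 'StatusDayIndex', key))
  );
  return results.flat();
}
```

The granularity (day, hour, and so on) is a design choice: finer buckets give more Queries to run in parallel, but also more requests per lookup.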
