How to split DynamoDB partitions efficiently?

Question

I have a use case where the number of partitions generated is low, which causes throttling issues.

Let's say my item has a few fields, three of which are organizationId, createdTime and itemType. We are trying to implement pagination, and we want to retrieve the items in descending order of createdTime.

The GSI we had was organizationId (hash) and createdTime (range), which is very bad. We chose it because it is the only way we can retrieve items in sorted order for the whole organization. Later we started appending itemType to organizationId, so the hash key became organizationId-itemType. But there are only a handful of itemTypes, so we are still seeing throttling issues.

I want to make this efficient. If we split the records into, let's say, 10/20/50 random partitions, then gathering all the data and returning it in sorted order is too heavy and time-consuming. I know that is the worst option.

I know many people who have worked with DynamoDB must have hit similar use cases. How do people achieve this in DynamoDB? Would you say the use case is wrong for DynamoDB, or are there ideas to make this better (like a counter: each countered partition holds a limited set of records, the countered partition is locked if any concurrent operations are happening, and so on)?

Your ideas and suggestions would really help us solve this huge use case.

Adrian Praja · Accepted Answer · 2019-05-02T19:43:32

You could simply assign a unique ID/hash to each record and create a hash-only table keyed on that unique ID.

Then add as many GSIs as necessary.
E.g.: organizationId + createdTime

Most of the time, a GSI with projected attributes = KEYS_ONLY is the best option, as it is small and fast and can return thousands of items in one query. Table reads are also cheaper, even 10 times cheaper for eventually consistent reads, whereas with a non-KEYS_ONLY index every update to a projected attribute also has to be written to the GSI, costing writes.
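To make the layout concrete, here is a rough sketch of the table definition as keyword arguments for boto3's `create_table`. The table name `items`, the key attribute `itemId`, the index name `org-created-index`, the numeric type for createdTime and on-demand billing are all assumptions for illustration, not something from the question.

```python
def table_definition(table_name="items", index_name="org-created-index"):
    """Build keyword arguments for boto3's DynamoDB client.create_table().

    All names here are hypothetical; createdTime is assumed to be a number
    (for example epoch milliseconds).
    """
    return {
        "TableName": table_name,
        # Only attributes used in a key schema are declared here.
        "AttributeDefinitions": [
            {"AttributeName": "itemId", "AttributeType": "S"},
            {"AttributeName": "organizationId", "AttributeType": "S"},
            {"AttributeName": "createdTime", "AttributeType": "N"},
        ],
        # Hash-only base table: writes spread across the unique IDs.
        "KeySchema": [{"AttributeName": "itemId", "KeyType": "HASH"}],
        "GlobalSecondaryIndexes": [
            {
                "IndexName": index_name,
                "KeySchema": [
                    {"AttributeName": "organizationId", "KeyType": "HASH"},
                    {"AttributeName": "createdTime", "KeyType": "RANGE"},
                ],
                # The index stores only its own keys plus the table key.
                "Projection": {"ProjectionType": "KEYS_ONLY"},
            }
        ],
        # Assumption: on-demand billing, so no ProvisionedThroughput blocks.
        "BillingMode": "PAY_PER_REQUEST",
    }

# Usage (needs boto3 and AWS credentials):
#   boto3.client("dynamodb").create_table(**table_definition())
```

The unique ID itself can be anything well distributed, for example `uuid.uuid4().hex` generated when the item is written.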

Perfect case for KEYS_ONLY:
Display the data paginated, and for each chunk of 50/100 items, do a batch get of the items.
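A minimal sketch of that flow, assuming a boto3 low-level DynamoDB client is passed in as `client` and the same hypothetical names as before (`items`, `itemId`, `org-created-index`). It queries the index newest first, then batch-gets the full items. BatchGetItem accepts at most 100 keys per call and does not preserve order, so the page size should stay at or below 100 and the results are re-ordered by the key list.

```python
def fetch_page(client, org_id, page_size=50, start_key=None,
               table="items", index="org-created-index"):
    """Return (full items, newest first; cursor for the next page or None)."""
    query_args = {
        "TableName": table,
        "IndexName": index,
        "KeyConditionExpression": "organizationId = :org",
        "ExpressionAttributeValues": {":org": {"S": org_id}},
        "ScanIndexForward": False,  # descending createdTime
        "Limit": page_size,
    }
    if start_key:
        query_args["ExclusiveStartKey"] = start_key
    page = client.query(**query_args)

    # A KEYS_ONLY index row still carries the base table key (itemId).
    keys = [{"itemId": row["itemId"]} for row in page["Items"]]
    if not keys:
        return [], None

    found = {}
    pending = {table: {"Keys": keys}}
    while pending:
        resp = client.batch_get_item(RequestItems=pending)
        for item in resp["Responses"].get(table, []):
            found[item["itemId"]["S"]] = item
        # Throttled keys come back here; real code should back off first.
        pending = resp.get("UnprocessedKeys")

    # Restore the order given by the index query.
    ordered = [found[k["itemId"]["S"]] for k in keys
               if k["itemId"]["S"] in found]
    return ordered, page.get("LastEvaluatedKey")
```

The returned cursor can be passed back as `start_key` to fetch the next page.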

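Also, instead of creating another index for itemType, you can use a FilterExpression to select only the desired itemTypes, run as many queries as needed until you have the desired number of records to return, and then enrich your data with a batch read. Note that a filter on an index query can only see attributes that are projected into that index, so itemType has to be included in the projection for this to work.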
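The filter loop could look roughly like this. It is a sketch under the same assumptions as before (boto3 low-level client passed in as `client`, hypothetical table and index names), plus the assumption that itemType is projected into the index, for example with an INCLUDE projection. Because `Limit` counts items evaluated before the filter is applied, a page can come back with few or no matches, so the loop keeps following `LastEvaluatedKey`.

```python
def fetch_matching_keys(client, org_id, item_types, wanted=50, start_key=None,
                        table="items", index="org-created-index"):
    """Collect at least `wanted` index rows of the given types, newest first.

    Returns (rows, cursor). Rows from the last page are not truncated, so
    slightly more than `wanted` may be returned. `item_types` must be
    non-empty.
    """
    type_values = {":t%d" % n: {"S": t} for n, t in enumerate(item_types)}
    args = {
        "TableName": table,
        "IndexName": index,
        "KeyConditionExpression": "organizationId = :org",
        # Assumes itemType is projected into the index.
        "FilterExpression": "itemType IN (" + ", ".join(type_values) + ")",
        "ExpressionAttributeValues": {":org": {"S": org_id}, **type_values},
        "ScanIndexForward": False,  # descending createdTime
        "Limit": wanted,  # items evaluated per request, before filtering
    }
    matches, cursor = [], start_key
    while len(matches) < wanted:
        if cursor:
            args["ExclusiveStartKey"] = cursor
        page = client.query(**args)
        matches.extend(page["Items"])
        cursor = page.get("LastEvaluatedKey")
        if not cursor:  # reached the end of the organization's items
            break
    return matches, cursor
```

Filtered-out items still consume read capacity, so this works best when the wanted types make up a good share of the organization's items.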
