0
votes

I'm designing a table in DynamoDB which will contain a large number of records, each with a unique ID and a timestamp. I will need to retrieve a set of records that fall between two dates, irrespective of all other property values.

Adding a global secondary index for the timestamp field seems like a logical solution, however this isn't straightforward.

The Query command in DynamoDB requires a KeyConditionExpression parameter, which determines which results are returned by the query. From the DynamoDB developer guide:

To specify the search criteria, you use a key condition expression—a string that determines the items to be read from the table or index. You must specify the partition key name and value as an equality condition. You can optionally provide a second condition for the sort key (if present).

Since the partition key must be specified exactly, it cannot be used for querying a range of values. A range could be specified for the sort key, but only in addition to an exact match on the partition key. Hence, the only way I can see this working is to add a dummy field for the index partition key, where every record has the same value, then perform the query on the timestamp as the sort key. This seems hacky, and, presumably, is not how it's intended to be used.

Some answers to similar questions suggest using the Scan command instead of Query, however this would be very inefficient when fetching a small number of records from a very large table.

Is it possible to efficiently query the table to get all records where a condition matches the timestamp field only?

1
Are there any interesting details/constarints around your primary access pattern? For example, is there a maximum time range between the two dates? Can you define what you mean by a "large" number of records? How often will you be making these queries? With DynamoDB, the details of your access patterns matter when coming up with the best solution. Any detail you can add might help!Seth Geoghegan
It's more of a hypothetical question out of interest, as the amount of data I'm dealing with is pretty small, But I see now, from your comment and Charles's answer, that with DynamoDB there is no one way to do things. Would you say that with DynamoDB the usage patterns potentially have more of an impact on database design than with other database engines?Jonathan Holvey
You're spot on with your observation around usage patterns impacting design of your data model in DynamoDB. You'll often hear the advice when working with DDB to start by listing your access patterns. As you've said, there's no one way to model data in DDB. Rather, it's your access patterns that drive the designSeth Geoghegan

1 Answers

2
votes

How big of a range are you dealing with?

You could for instance have a GSI partition key of
YYYY
or YYYY-MM
or YYYY-MM-DD

Your sort key could be the remainder of the timestamp..

You may need to make multiple queries, if for instance the amount of data necessitates daily partitions and you want to show 7 days at a time.

Also be sure to read the best practices for time-scale data part of the developer guide.