0
votes

I have a DynamoDB table in which I store link data (url, date, category, tags etc).

I need to be able to -

  • lookup items by url (check url doesn't already exist)
  • find all items stored after a given date

Based on the above I set up the schema with url as primary hash key and secondary index on date, as follows -

AWSTemplateFormatVersion: '2010-09-09'
Parameters:
  TableName:
    Type: String
    Default: "my_links"
  HashAttr:
    Type: String
    Default: url
  IndexAttr:
    Type: String
    Default: date
  ReadCapacity:
    Type: Number
    Default: 5
  WriteCapacity:
    Type: Number
    Default: 5
Resources:
  Table:
    Properties:
      KeySchema:
        - AttributeName: !Ref HashAttr
          KeyType: HASH
      AttributeDefinitions:
        - AttributeName: !Ref HashAttr
          AttributeType: S
        - AttributeName: !Ref IndexAttr
          AttributeType: S
      GlobalSecondaryIndexes:
        - IndexName: !Ref IndexAttr
          KeySchema:
            - AttributeName: !Ref IndexAttr
              KeyType: HASH
          Projection:
            ProjectionType: ALL
          ProvisionedThroughput:
            ReadCapacityUnits: !Ref ReadCapacity
            WriteCapacityUnits: !Ref WriteCapacity
      ProvisionedThroughput:
        ReadCapacityUnits: !Ref ReadCapacity
        WriteCapacityUnits: !Ref WriteCapacity
      TableName: !Ref TableName
    Type: AWS::DynamoDB::Table

I can query the table by date as follows, but only using an eq condition -

ddb=boto3.resource("dynamodb")
table=ddb.Table("my_links")
from boto3.dynamodb.conditions import Key
queryexp=Key('date').eq("2020-02-19")
for item in table.query(IndexName="date",
                        KeyConditionExpression=queryexp)["Items"]:
    print (item)    

If I use a gte in place of eq condition I get the following -

botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the Query operation: Query key condition not supported

I can however query the table using a scan and a gte condition -

filterexp=Key('date').gte("2020-02-19")
for item in table.scan(FilterExpression=filterexp)["Items"]:
    print (item)

But then I'm guessing I don't need the secondary index any more and also that this will get very expensive as the table get large :-/

So I'd rather stick with a secondary index and query if possible (am I thinking about this right ?), but what do I need to do to the schema to be able to fetch all items after a date ?

2

2 Answers

0
votes

You can't use GTE. Queries support EQ | LE | LT | GE | GT | BEGINS_WITH | BETWEEN

Check this article https://medium.com/cloud-native-the-gathering/querying-dynamodb-by-date-range-899b751a6ef2

0
votes

So this was useful -

https://stackoverflow.com/a/38790120/124179

and also this -

How to query DynamoDB by date (range key), with no obvious hash key?

In the end the answer was to denormalised the data, replacing the date field (and associated secondary index) with a week field, then searching for one or more specific weeks using eq query condition and joining the results (I only need a couple of weeks data)

Obviously could replace week with month for increased range but reduced granularity.