2
votes

I have a social media app where people can leave posts. A post consists of an Author, TimeStamp, Subject and Body. The posts are stored in DynamoDB with Author as partition key and TimeStamp as sort key.

How can I query for all posts of today? I want the results returned sorted on TimeStamp, so scanning DynamoDB is not an option.

The Query operation finds items based on primary key values, so I think I have to define a global secondary index. But I cannot find a suitable key to retrieve all of today's posts. What is a good choice here?


2 Answers

2
votes

You can add an attribute representing the day and use it as the GSI partition key, with the timestamp attribute as the GSI sort key.

Example:

|  AuthorId  | Timestamp (GSI SK) | DayAttribute (GSI PK) |
|:----------:|-------------------:|:---------------------:|
| author_id  |     1534522921     |      2018-08-17       |
| author_id2 |     1534522922     |      2018-08-17       |
| author_id3 |     1534609323     |      2018-08-18       |

When you query the GSI with a key condition on the day (for example DayAttribute = 2018-08-17), the results come back sorted by Timestamp.
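As a rough sketch of what that query could look like from JavaScript: the code below only builds the parameter object for a Query call and does not talk to DynamoDB. The table name `Posts`, the index name `DayIndex`, and both function names are made up for illustration. The day is computed in UTC, which is an assumption; an app that defines "today" in a local time zone would need a different conversion.

```javascript
// Derive the GSI partition key value (UTC calendar day) from an epoch-seconds timestamp.
function dayAttributeFor(epochSeconds) {
  return new Date(epochSeconds * 1000).toISOString().slice(0, 10);
}

// Build Query parameters for all posts of one day, oldest first.
// "Posts" and "DayIndex" are hypothetical names, not taken from the question.
function buildPostsForDayQuery(day) {
  return {
    TableName: "Posts",
    IndexName: "DayIndex",
    // Only the partition key is constrained; the sort key just orders the results.
    KeyConditionExpression: "#day = :day",
    ExpressionAttributeNames: { "#day": "DayAttribute" },
    ExpressionAttributeValues: { ":day": day },
    ScanIndexForward: true, // ascending by the GSI sort key (Timestamp)
  };
}

const todayParams = buildPostsForDayQuery(dayAttributeFor(1534522921));
console.log(todayParams.ExpressionAttributeValues[":day"]); // 2018-08-17
```

The resulting object can be passed to the DocumentClient `query` method of the AWS SDK. A single Query call returns at most 1 MB of data, so a busy day has to be paged through with `LastEvaluatedKey`.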

Edit in response to your comment:

It's not a good approach to add a new attribute for every query you need to cover. What I would suggest in this case is to use sort keys hierarchically.

This means covering your most closely related queries with a single GSI and a hierarchical sort key. Let's say you want to query by month, week, day, hour, or minute.

This would be the table:

|  AuthorId  |     Timestamp (GSI SK)     |  MonthAttr (GSI PK)  |
|:----------:|---------------------------:|:--------------------:|
| author_id  |    2018:08:17::10:03:25    |       2018-08        |
| author_id  |    2018:08:17::10:03:25    |       2018-08        |
| author_id  |    2018:08:18::10:03:25    |       2018-08        |

In this table, by combining an equality condition on MonthAttr with sort key conditions such as begins_with or BETWEEN, you can query all items of a month, the items between days 10 and 15, the items of a specific day between 10:00 and 12:00, and so on.

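For instance, to get a single day the sort key condition would be begins_with(Timestamp, "2018:08:17"), and to get a single hour of that day it would be begins_with(Timestamp, "2018:08:17::10:"). For the past 13 days begins_with is not enough, because it matches only one prefix; use a range condition such as Timestamp >= "2018:08:04" instead.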
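The sort key conditions described above might be built like this. It is a sketch under assumptions: `Posts` and `MonthIndex` are hypothetical names, the functions are illustrative helpers that only produce Query parameters, and timestamps are formatted in UTC. `Timestamp` is a DynamoDB reserved word, so it is referenced through an expression attribute name placeholder.

```javascript
// Format a Date as the hierarchical sort key used in the table above (UTC).
function formatSortKey(date) {
  const p = (n) => String(n).padStart(2, "0");
  const day = [date.getUTCFullYear(), p(date.getUTCMonth() + 1), p(date.getUTCDate())].join(":");
  const time = [p(date.getUTCHours()), p(date.getUTCMinutes()), p(date.getUTCSeconds())].join(":");
  return `${day}::${time}`;
}

// Shared parts of every query against the (hypothetical) month index.
function baseQuery(month) {
  return {
    TableName: "Posts",
    IndexName: "MonthIndex",
    ExpressionAttributeNames: { "#month": "MonthAttr", "#ts": "Timestamp" },
    ExpressionAttributeValues: { ":month": month },
    ScanIndexForward: true,
  };
}

// All items of `month` whose sort key starts with `prefix` (one day, one hour, ...).
function buildPrefixQuery(month, prefix) {
  const q = baseQuery(month);
  q.KeyConditionExpression = "#month = :month AND begins_with(#ts, :prefix)";
  q.ExpressionAttributeValues[":prefix"] = prefix;
  return q;
}

// All items of `month` whose sort key lies between `from` and `to`, inclusive.
function buildRangeQuery(month, from, to) {
  const q = baseQuery(month);
  q.KeyConditionExpression = "#month = :month AND #ts BETWEEN :from AND :to";
  q.ExpressionAttributeValues[":from"] = from;
  q.ExpressionAttributeValues[":to"] = to;
  return q;
}

const oneHour = buildPrefixQuery("2018-08", "2018:08:17::10:");
const days10to15 = buildRangeQuery("2018-08", "2018:08:10::00:00:00", "2018:08:15::23:59:59");
console.log(formatSortKey(new Date(Date.UTC(2018, 7, 17, 10, 3, 25)))); // 2018:08:17::10:03:25
```

Because the partition key is the month, a range that crosses a month boundary needs one query per month, with the results concatenated in month order.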

This approach introduces a hot partition key problem, because every write in a given month goes to the same GSI partition key. Take a look at the Time Series Data model to understand more about this approach and how to deal with it.
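One commonly used mitigation, which is not necessarily the one the Time Series Data guidance describes, is write sharding: a small suffix is appended to the partition key so that writes spread over several key values, and a read queries every shard and merges the results. The sketch below assumes a day-based key as in the first table and numeric timestamps; the shard count and all function names are hypothetical.

```javascript
// Number of shards per day; an assumed value to tune against real write volume.
const SHARD_COUNT = 4;

// Deterministically map an author to a shard, so the same author always uses the same suffix.
function shardFor(authorId, shardCount = SHARD_COUNT) {
  let sum = 0;
  for (const ch of authorId) sum += ch.codePointAt(0);
  return sum % shardCount;
}

// Partition key value to store on a post: the day followed by "#" and the shard number.
function shardedDayKey(day, authorId, shardCount = SHARD_COUNT) {
  return `${day}#${shardFor(authorId, shardCount)}`;
}

// All partition key values that must be queried to read one full day.
function allShardKeys(day, shardCount = SHARD_COUNT) {
  return Array.from({ length: shardCount }, (_, i) => `${day}#${i}`);
}

// Merge the per-shard result lists back into one list ordered by Timestamp.
function mergeByTimestamp(pages) {
  return pages.flat().sort((a, b) => a.Timestamp - b.Timestamp);
}
```

The trade-off is that reading a full day now takes one Query per shard plus a merge on the client, in exchange for spreading the write load.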

0
votes

Say you have a table defined as follows:

resources:
  Resources:
    BotChatHistory:
      Type: 'AWS::DynamoDB::Table'
      DeletionPolicy: Retain
      Properties:
        TableName: ${self:provider.environment.DYNAMODB_BOT_CHAT_HISTORY_TABLE_NAME}
        AttributeDefinitions:
          - AttributeName: messageId
            AttributeType: S
          - AttributeName: userId
            AttributeType: S
          - AttributeName: createdDate
            AttributeType: S
        KeySchema:
          - AttributeName: messageId
            KeyType: HASH
        BillingMode: PAY_PER_REQUEST
        GlobalSecondaryIndexes:
        - IndexName: ${self:provider.environment.DYNAMODB_BOT_CHAT_HISTORY_TABLE_GSI_1}
          KeySchema:
          - AttributeName: userId
            KeyType: HASH
          - AttributeName: createdDate
            KeyType: RANGE
          Projection:
            ProjectionType: ALL 

Then you can query it as follows:

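/**
 * getChatHistory
 *
 * Queries the GSI for all messages of one user, sorted ascending by createdDate.
 * `db` is a helper that is not shown here; it is assumed to wrap the DynamoDB client.
 */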
const getChatHistory = async (organizationId, userId) => {
  return db('query', organizationId, {
    TableName: process.env.DYNAMODB_BOT_CHAT_HISTORY_TABLE_NAME,
    IndexName: process.env.DYNAMODB_BOT_CHAT_HISTORY_TABLE_GSI_1,
    KeyConditionExpression: "userId = :userId",
    ExpressionAttributeValues: {
        ":userId": userId
    },
    ScanIndexForward: true,
  });
}
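The query above returns a user's whole history. To narrow it to a single day, a sort key condition on `createdDate` could be added. The sketch below assumes `createdDate` holds ISO 8601 strings such as `2018-08-17T10:03:25Z`, which the answer does not state. `buildChatHistoryForDayQuery` is a hypothetical helper that only returns the parameter object instead of calling the `db` function, and `user-123` is a placeholder id. It still queries per user, so it does not return the posts of all users for a day.

```javascript
// Build Query parameters for one user's messages on one UTC day, oldest first.
// Assumes createdDate is an ISO 8601 string, so a "YYYY-MM-DD" prefix selects a day.
function buildChatHistoryForDayQuery(userId, day) {
  return {
    TableName: process.env.DYNAMODB_BOT_CHAT_HISTORY_TABLE_NAME,
    IndexName: process.env.DYNAMODB_BOT_CHAT_HISTORY_TABLE_GSI_1,
    KeyConditionExpression: "userId = :userId AND begins_with(createdDate, :day)",
    ExpressionAttributeValues: {
      ":userId": userId,
      ":day": day,
    },
    ScanIndexForward: true, // ascending by createdDate
  };
}

// Today's date in UTC, formatted as YYYY-MM-DD.
const todayUtc = new Date().toISOString().slice(0, 10);
const historyParams = buildChatHistoryForDayQuery("user-123", todayUtc);
```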