3
votes

The AWS DynamoDB documentation includes an example schema for a forum. However, the number of questions this schema is able to answer seems very small. In addition, the table seems to suffer from a hot-key problem (a burst of replies backs up on the same partition).

In a talk title "Advanced Design Patterns for Amazon DynamoDB" the presenter around 43 minutes breaks down a complex use-case from Audible using only a single table with 3 GSI (indexes).

I'm trying to learn proper DynamoDB modeling coming from a standard RDBMS 3NF background. How would a forum be designed to prevent hot-partitions while still meeting these common use-cases?

Queries:

  • Topics by Forum (sorted by date posted, or most recent reply)
  • Replies by Topic (sorted by date posted with pagination)
  • Replies by User (sorted by date posted)
  • Topics by User (sorted by date posted)
  • Topics with most votes

Basic Schema(?):

  • Forum: Partition key: Forum_GUID. Attributes: Name, Desc
  • User: Partition key: User_GUID. Attributes: email, join_date
  • Thread: Composite key: Forum_GUID, Topic_GUID. Attributes: posted_by, date, votes, body, subject
  • Reply: Composite key: Topic_GUID, Reply_GUID. Attributes: posted_by, date, votes, body

I'm assuming there are multiple solutions (including using a single table). I'm looking for any answer that can solve this while providing guidance on when, and how, to properly use indexes to scale out an application's writes.

1
Not every use case is appropriate for DynamoDB. Some things belong in relational databases.Michael - sqlbot
To avoid "hot partitions" you can add a random suffix/calculated suffix to your partition key to expand the space. docs.aws.amazon.com/amazondynamodb/latest/developerguide/…Sasank Mukkamala

1 Answers

3
votes

enter image description here

You can use the above schema. Now for you queries

  1. Topics by Forum (sorted by date posted, or most recent reply)

    Select from GSI2 where GSI2 pk=Forum123 and sortby GSI2 SK
    

    you can choose whom to keep in GSI2 Sk recent reply/date posted based on which use case is frequently asked.

  2. Replies by Topic (sorted by date posted with pagination)

    Select where pk=topic and sk startswith reply and sortby sk
    
  3. Replies by User (sorted by date posted)

    Select from GSI2 where pk=User123 and sk startswith reply and sortby sk
    
  4. Topics by User (sorted by date posted)

    Select from GSI2 where pk=User123 and sk startswith topic and sortby sk
    
  5. Topics with most votes
This will require another GSI if you want to do this operation across multiple forums. but This GSI will certainly suffer from hot key issue. since there will be only one key. Instead of doing that, you can keep one fixed key value in your table who keeps these counts. and these values are updated by an async process.