DynamoDB NoSQL design for queries

Question

I am looking to store a log of user events. It is going to be a lot of entries so I thought DynamoDB would be good as everything else is hosted there.

I need to query these events in two ways: the total number of events for a user for a date (or date range), and occasionally all the events for a date.

I was thinking to store it in one table as user id (key), sequence number (key), date, time and duration.

Should it be multiple tables? How can this be done most efficiently?

What is the sequence number? Is it supposed to be like an auto-increment field just for referencing an event, or what? Another question: would you like to query all events for a date across all users, or all events for a specific user? — xtx
We have the sequence number to make each row unique and we provide this. We only need to query one user at a time. — Gustaf

gkatzioura · Accepted Answer · 2017-02-16T14:59:14

For a small amount of data this structure is OK. Keep in mind that the sequence number (your range key) has to be provided by you. It seems a good idea to use the event date as the sort key instead, stored as a Unix timestamp with millisecond precision.
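To make the timestamp idea concrete, here is a minimal sketch of how a date range could be turned into millisecond bounds and into parameters for a DynamoDB Query that only counts matching events. The table name `user_events` and the attribute name `event_ts` are my own placeholders, not something from the question, and the dict is only built here, not sent anywhere; it assumes days are delimited in UTC.

```python
from datetime import date, datetime, time, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(dt):
    """Convert a timezone-aware datetime to Unix time in whole milliseconds."""
    return (dt - EPOCH) // timedelta(milliseconds=1)


def day_range_ms(first_day, last_day):
    """Return (start_ms, end_ms) covering first_day..last_day inclusive, in UTC."""
    start = datetime.combine(first_day, time.min, tzinfo=timezone.utc)
    end = datetime.combine(last_day + timedelta(days=1), time.min, tzinfo=timezone.utc)
    return to_epoch_ms(start), to_epoch_ms(end) - 1


def count_query_params(table_name, user_id, first_day, last_day):
    """Build Query parameters that count one user's events in a date range.

    'user_id' is the partition key and 'event_ts' (hypothetical name) is the
    numeric sort key holding the millisecond timestamp.
    """
    start_ms, end_ms = day_range_ms(first_day, last_day)
    return {
        "TableName": table_name,
        "KeyConditionExpression": "user_id = :uid AND event_ts BETWEEN :start AND :end",
        "ExpressionAttributeValues": {
            ":uid": {"S": user_id},
            ":start": {"N": str(start_ms)},
            ":end": {"N": str(end_ms)},
        },
        # Return only the number of matching items, not the items themselves.
        "Select": "COUNT",
    }


params = count_query_params("user_events", "user-42", date(2017, 2, 16), date(2017, 2, 17))
```

With boto3 this dict could be passed as `client.query(**params)`. Note that the count comes back per page of results, so for large ranges you would sum `Count` while following `LastEvaluatedKey`. Two events in the same millisecond for the same user would collide on this key, so a tie-breaker may still be needed.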

There is no need for extra tables. However, your structure depends largely on the read and write capacity that you want to achieve, and on the data size.

Suppose your user_id is your partition key. Then the following limits apply.

For every distinct partition key value, the total size of all table and index items cannot exceed 10 GB (this item collection limit applies to tables that have local secondary indexes). A single partition can support a maximum of 3,000 read capacity units or 1,000 write capacity units.

You need to design your partition keys with these limitations in mind. For example, a very active user generates many events, so you may need more than 1,000 write capacity units for that user. Unfortunately, you have chosen the user id as the partition key.

In this case you are limited to 1,000 write capacity units per user, so writes beyond that rate may be throttled and fail.
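As a rough way to check whether a single user could hit that ceiling, here is a small sketch of the arithmetic. It relies on the standard rule that one write capacity unit covers one write per second for an item up to 1 KB, with larger items rounded up to the next 1 KB. The function names and the example numbers are made up for illustration.

```python
import math

# Per-partition write ceiling quoted above.
MAX_WCU_PER_PARTITION = 1000


def required_wcu(writes_per_second, item_size_bytes):
    """Estimate write capacity units needed for a steady write rate.

    One WCU covers one standard write per second for an item up to 1 KB;
    larger items consume one extra unit per additional started KB.
    """
    units_per_write = max(1, math.ceil(item_size_bytes / 1024))
    return math.ceil(writes_per_second * units_per_write)


def exceeds_partition_limit(writes_per_second, item_size_bytes):
    """True if one partition key value would need more than the ceiling."""
    return required_wcu(writes_per_second, item_size_bytes) > MAX_WCU_PER_PARTITION


# Hypothetical figures: a burst of 1,500 small events per second for one user.
needed = required_wcu(1500, 200)
too_hot = exceeds_partition_limit(1500, 200)
```

If the estimate stays well below 1,000 for your busiest user, the plain user id key is probably fine.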

You need a different structure, for example partition key values like user_id_1, user_id_2, etc. In other words, use a partition naming scheme that spreads the data across partitions according to your application's needs.
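One possible reading of that naming scheme, sketched below, is to append a shard number to the user id so that a busy user's writes land on several partition key values. The shard count, the hash-based choice of suffix, and the function names are assumptions of mine, not a prescribed method.

```python
import hashlib


def shard_key_for_event(user_id, sequence_number, shard_count):
    """Pick a sharded partition key such as 'alice_3' for one event.

    The shard is derived from a hash of the user id and sequence number, so
    the same event always maps to the same key (hypothetical scheme).
    """
    digest = hashlib.sha256(f"{user_id}:{sequence_number}".encode("utf-8")).digest()
    shard = int.from_bytes(digest[:8], "big") % shard_count + 1
    return f"{user_id}_{shard}"


def all_shard_keys(user_id, shard_count):
    """List every partition key value that may hold events for this user."""
    return [f"{user_id}_{n}" for n in range(1, shard_count + 1)]


key = shard_key_for_event("alice", 12345, 4)
keys_to_query = all_shard_keys("alice", 4)
```

The trade-off is on the read side: to get a user's events or totals for a date range you would run one query per key in `all_shard_keys` and merge or sum the results, so keep the shard count as small as your write rate allows.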

Check these links on DynamoDB limitations:

Tables guidance, Partition distribution
