5
votes

After reading the Amazon DynamoDB documentation, I still can't work out the best way to use it for the most common task: storing several types of documents (for example, 'user', 'event', 'news'), each with a unique id. As I understand it, DynamoDB imposes restrictions only on a document's primary key, so we can store any data that has one. The most natural solution therefore looks like this:

- partition key 'type' is the document type: 'user', 'event', etc.
- sort key is uuid

But this contradicts the official documentation, according to which the better design is:

- partition key 'id' is just uuid
- sort key is type - 'user', 'event'

But this contradicts common sense because of the key names. Finally, we could just create 3 separate DynamoDB tables for users, events and news, each with the uuid as partition key and no sort key. Which solution is best, or is common practice, for DynamoDB?


2 Answers

2
votes

Specifics would be required for definitive statements, but making some assumptions about what your data looks like:

- partition key 'type' is the document type: 'user', 'event', etc.
- sort key is uuid

The above idea is almost certainly a poor design. With only a handful of distinct partition key values, you would end up with a few large partitions, leading to performance problems. Additionally, I expect you would end up performing Scans rather than Queries.
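To make the "few large partitions" point concrete, here is a rough simulation. It is a deliberately simplified model, not how DynamoDB works internally: `NUM_PARTITIONS` is a made-up number, and MD5 is only a stand-in for DynamoDB's own hash function, which is not something you control or see.

```python
import hashlib
import uuid
from collections import Counter

NUM_PARTITIONS = 8  # hypothetical; DynamoDB manages the real partition count itself


def partition_for(partition_key: str) -> int:
    """Map a partition key to a bucket. MD5 is a stand-in for DynamoDB's internal hash."""
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS


def spread(partition_keys) -> Counter:
    """Count how many items land in each simulated partition."""
    return Counter(partition_for(key) for key in partition_keys)


# 3000 fake documents, evenly split across the three types from the question.
types = ["user", "event", "news"]
items = [(types[i % 3], str(uuid.uuid4())) for i in range(3000)]

by_type = spread(doc_type for doc_type, _ in items)  # partition key = type
by_uuid = spread(doc_id for _, doc_id in items)      # partition key = uuid

print("partition key = type:", dict(by_type))
print("partition key = uuid:", dict(by_uuid))
```

With the type as partition key, all 3000 items can occupy at most three buckets, however many partitions exist. With the uuid as partition key, they spread across all of them.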

- partition key 'id' is just uuid
- sort key is type - 'user', 'event'

The above idea is probably a poor design. The sort key gives you no real benefit. Assuming you need to access users, events, etc. separately, you will end up performing Scans.

It's highly likely that separate tables for users, events, etc. would be best. The keys within those tables really just depend on your data. UUIDs may be a good option, but it depends on how you want to Query the data. Date-based attributes, especially for events, often make good sort keys.
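As a sketch of what separate tables could look like, the helper below builds keyword arguments in the shape that boto3's `create_table` accepts. It only builds dictionaries and never calls AWS. The table names, the `organizer_id` partition key and the `starts_at` sort key are assumptions of mine for illustration; your own access patterns should decide the real keys.

```python
def table_definition(name, partition_key, sort_key=None):
    """Build create_table keyword arguments; all key attributes are strings ("S")."""
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    attributes = [{"AttributeName": partition_key, "AttributeType": "S"}]
    if sort_key is not None:
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
        attributes.append({"AttributeName": sort_key, "AttributeType": "S"})
    return {
        "TableName": name,
        "KeySchema": key_schema,
        "AttributeDefinitions": attributes,
        "BillingMode": "PAY_PER_REQUEST",  # on-demand; an assumption, not a recommendation
    }


tables = [
    # Users and news: looked up by uuid only, so no sort key.
    table_definition("users", "id"),
    table_definition("news", "id"),
    # Events: hypothetical grouping key plus an ISO 8601 timestamp as sort key,
    # so events under one organizer can be queried by date range.
    table_definition("events", "organizer_id", sort_key="starts_at"),
]

for definition in tables:
    print(definition["TableName"], definition["KeySchema"])
    # To actually create it: boto3.client("dynamodb").create_table(**definition)
```

Storing `starts_at` as an ISO 8601 string keeps lexicographic order equal to chronological order, which is what makes range conditions on the sort key useful.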

I suggest you check out this answer, which is a much longer answer to roughly the same question.

2
votes

This is a somewhat subjective question, but I'll take a shot and give you some reasons.

1) You can use "uuid:type" as your primary id. You shouldn't use type as the partition key, because all of the data for each type would end up on a single partition. Partitions are how DynamoDB and S3 provide parallelization. You would greatly limit ingest and query speeds if you did that.

2) I would personally favor a single table over one per type for pricing reasons. You pay for read/write capacity per table. With a single table, you would have less to track and manage, and it is easier to tune the capacity of one table than three.

3) I would not use the sort key in this scenario.
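
For point 1, a minimal sketch of building and parsing a "uuid:type" id might look like this. The function names `make_id` and `split_id` are made up for the example. It relies on the fact that a UUID's string form contains hyphens but no colons, so the first colon always separates the two parts.

```python
import uuid


def make_id(doc_type: str) -> str:
    """Build a composite id of the form '<uuid>:<type>'."""
    return f"{uuid.uuid4()}:{doc_type}"


def split_id(composite_id: str) -> tuple[str, str]:
    """Split a composite id back into (uuid, type) at the first colon."""
    doc_uuid, doc_type = composite_id.split(":", 1)
    return doc_uuid, doc_type


user_id = make_id("user")
print(user_id)
print(split_id(user_id))
```

The uuid prefix keeps the partition key values well spread, while the type suffix lets you tell what kind of document an id refers to without reading the item.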