DynamoDB partition key choice for notes app

Question

I want to create a DynamoDB table that allows me to save notes from users.

The attributes I have:

user_id
note_id (uuid)
type
text

The main queries I will need:

Get all notes of a certain user
Get a specific note
Get all notes of a certain type (the less used query)

I know that, in terms of performance and DynamoDB partitions, note_id would be the right choice because the values are unique and would be distributed evenly across the partitions. On the other hand, it is much harder to get all notes of a user without scanning all items or using a GSI. And if the IDs are unique, I suppose it doesn't make sense to have a sort key.

The other option would be to use user_id as the partition key and note_id as the sort key, but if certain users have a much larger number of notes than others, wouldn't that impact my performance?

Is it better to have a unique partition key (like note_id) so the table scales well across DynamoDB partitions, and use GSIs for my queries, or to pick the partition key that serves my main query (user_id)?

Thanks

Do your two secondary searches really happen independently of knowing the user_id? I found your question whilst in the middle of doing some complex key design of my own and it seems possible you are over-generalising the scenarios. — Andy Dent

smcstewart · Accepted Answer · 2017-11-14T18:47:08

Possibly the simplest and most cost-effective way would be a single table:

Table Structure

note_id (uuid) / hash key
user_id
type
text

Have two GSIs, one for "Get all notes of a certain user" and one for "Get all notes of a certain type (the less used query)":

GSI for "Get all notes of a certain user"

user_id / hash key
note_id (uuid) / range key
type
text

A little note on this: which of your queries is the most frequent, "Get all notes of a certain user" or "Get a specific note"? If it's the former, you could swap the GSI keys and the table keys. In essence, use user_id (hash key) + note_id (range key) as the primary key of your table and note_id as the GSI hash key. This also depends on how you structure your user_id, which I suspect you've already picked up on: make sure your user_id is not sequential; make it a UUID or similar.
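To make the swap concrete, here is a minimal sketch of the key layout it implies, written as plain Python dicts in the shape the DynamoDB API expects. The table name "notes" and index name "note_id-index" are made up for illustration, and the index fragment omits capacity settings.

```python
# Hypothetical sketch of the swapped layout; "notes" and "note_id-index"
# are illustrative names, not anything taken from the question.

# Table primary key: user_id (hash key) + note_id (range key).
TABLE_KEY_SCHEMA = [
    {"AttributeName": "user_id", "KeyType": "HASH"},
    {"AttributeName": "note_id", "KeyType": "RANGE"},
]

# GSI keyed on note_id alone, for "Get a specific note" when the
# caller does not know the user_id.
NOTE_GSI = {
    "IndexName": "note_id-index",
    "KeySchema": [{"AttributeName": "note_id", "KeyType": "HASH"}],
    "Projection": {"ProjectionType": "ALL"},
}


def notes_for_user_request(user_id, table_name="notes"):
    """Build Query parameters that read all notes of one user from the table itself."""
    return {
        "TableName": table_name,
        "KeyConditionExpression": "user_id = :uid",
        "ExpressionAttributeValues": {":uid": {"S": user_id}},
    }
```

With this layout the most frequent query goes straight to the table, and only lookups by note_id alone go through the index. If the caller always knows the user_id as well, a GetItem on the full key would do and the index may not be needed at all.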

GSI for "Get all notes of a certain type (the less used query)"

type / hash key
note_id (uuid) / range key
user_id
text

Depending upon the cardinality of the type field, you'll need to test whether a GSI will actually be of benefit here or not.
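For reference, the table and the two GSIs described above could be declared roughly as follows. This is only a sketch: the table name, the index names and the capacity numbers are placeholders I picked, and all attributes are assumed to be strings.

```python
def notes_table_definition(table_name="notes", read_units=5, write_units=5):
    """Build CreateTable parameters for the single-table layout with two GSIs.

    The names and the capacity values are placeholders, not recommendations.
    """
    throughput = {"ReadCapacityUnits": read_units, "WriteCapacityUnits": write_units}

    def gsi(index_name, hash_attr):
        # Each index uses its own hash key plus note_id as the range key
        # and projects every attribute ("fat" index).
        return {
            "IndexName": index_name,
            "KeySchema": [
                {"AttributeName": hash_attr, "KeyType": "HASH"},
                {"AttributeName": "note_id", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
            "ProvisionedThroughput": dict(throughput),
        }

    return {
        "TableName": table_name,
        # Only attributes used in a key schema are declared here.
        "AttributeDefinitions": [
            {"AttributeName": "note_id", "AttributeType": "S"},
            {"AttributeName": "user_id", "AttributeType": "S"},
            {"AttributeName": "type", "AttributeType": "S"},
        ],
        "KeySchema": [{"AttributeName": "note_id", "KeyType": "HASH"}],
        "GlobalSecondaryIndexes": [
            gsi("user_id-index", "user_id"),
            gsi("type-index", "type"),
        ],
        "ProvisionedThroughput": dict(throughput),
    }
```

Assuming boto3 is installed and credentials are configured, the result could be passed on with boto3.client("dynamodb").create_table(**notes_table_definition()).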

If the GSI is of little benefit and you need more performance, another option would be to store each type together with an array of note_ids in a separate table altogether. Beware of the 400 KB item size limit with this one, and of the fact that you'll need to perform another query to get the text of the note.
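To get a feel for how quickly the item size limit bites, here is a rough back-of-the-envelope helper. It assumes the limit is 400 * 1024 bytes and ignores attribute names and any per-element overhead, so treat the result as an upper bound rather than an exact figure.

```python
# Assumption: the 400 KB item limit is taken as 400 * 1024 bytes.
ITEM_LIMIT_BYTES = 400 * 1024


def max_note_ids(id_bytes, other_bytes=0, item_limit_bytes=ITEM_LIMIT_BYTES):
    """Upper bound on how many note IDs fit in one item.

    id_bytes:    size of one stored ID (36 for a UUID in its usual text form,
                 16 if stored as raw binary).
    other_bytes: space taken by the rest of the item (names, other attributes).
    """
    return (item_limit_bytes - other_bytes) // id_bytes
```

For example, max_note_ids(36) comes out at a little over 11,000 IDs per type, and max_note_ids(16) at 25,600, so a popular type would outgrow a single item fairly soon.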

With this table structure and GSIs, you're able to make a single query for the information you're after, rather than the two you would need with two tables.
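As an illustration, each of the three access patterns then maps to one request. The helpers below only build the parameter dicts for the low-level GetItem and Query operations; the table and index names are the hypothetical ones used in this sketch, not anything DynamoDB prescribes.

```python
def get_note_request(note_id, table_name="notes"):
    """GetItem parameters for "Get a specific note" (table hash key)."""
    return {"TableName": table_name, "Key": {"note_id": {"S": note_id}}}


def notes_by_user_request(user_id, table_name="notes", index_name="user_id-index"):
    """Query parameters for "Get all notes of a certain user" (user GSI)."""
    return {
        "TableName": table_name,
        "IndexName": index_name,
        "KeyConditionExpression": "user_id = :uid",
        "ExpressionAttributeValues": {":uid": {"S": user_id}},
    }


def notes_by_type_request(note_type, table_name="notes", index_name="type-index"):
    """Query parameters for "Get all notes of a certain type" (type GSI)."""
    return {
        "TableName": table_name,
        "IndexName": index_name,
        # "#t" is a name placeholder, used in case "type" clashes
        # with one of DynamoDB's reserved words.
        "KeyConditionExpression": "#t = :t",
        "ExpressionAttributeNames": {"#t": "type"},
        "ExpressionAttributeValues": {":t": {"S": note_type}},
    }
```

With a boto3 client these would be used as client.get_item(**get_note_request(...)) and client.query(**notes_by_user_request(...)). Keep in mind that a Query result can be paginated, so a user with many notes may need several calls.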

Of course, you know your data best: start with the design you think fits and then test it to ensure it meets what you're looking for. DynamoDB is priced by provisioned throughput plus the amount of indexed data stored, so "fat" indexes that project many attributes, as above, cost more to store. If there is a lot of data, it could become more cost-effective to project fewer attributes, store less indexed data and perform two queries instead.
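If you go the "thin index, two queries" route, it could look something like the sketch below: the index projects only keys, and the note bodies are then fetched from the table in batches. The names are again placeholders, and the batch size of 100 reflects the per-call item limit of BatchGetItem.

```python
def keys_only_gsi(index_name, hash_attr):
    """GSI fragment that stores only key attributes (capacity settings omitted)."""
    return {
        "IndexName": index_name,
        "KeySchema": [
            {"AttributeName": hash_attr, "KeyType": "HASH"},
            {"AttributeName": "note_id", "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "KEYS_ONLY"},
    }


def batch_get_requests(note_ids, table_name="notes", batch_size=100):
    """Split note IDs into BatchGetItem parameter dicts of at most batch_size keys."""
    requests = []
    for start in range(0, len(note_ids), batch_size):
        chunk = note_ids[start:start + batch_size]
        requests.append({
            "RequestItems": {
                table_name: {"Keys": [{"note_id": {"S": nid}} for nid in chunk]}
            }
        })
    return requests
```

The first query against the thin index returns the note_ids, and each dict from batch_get_requests would then go to client.batch_get_item(**request). Whether the saved storage outweighs the extra reads is something you would have to measure with your own data.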
