I'm building a DynamoDB app that will eventually serve a large number (millions) of users. Currently the app's item schema is simple:
{
userId: "08074c7e0c0a4453b3c723685021d0b6", // partition key
email: "[email protected]",
... other attributes ...
}
When a new user signs up, or if a user wants to find another user by email address, we'll need to look up users by email instead of by userId. With the current schema that's easy: just use a global secondary index with email as the Partition Key.
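For concreteness, here is a minimal sketch of that lookup, written as a plain function that builds the request parameters for a low-level DynamoDB Query against the index. The table name `Users`, the index name `email-index`, and the example address are assumptions made up for illustration; only `email` comes from the schema above.

```python
def email_lookup_params(email, table="Users", index="email-index"):
    """Build keyword arguments for a low-level DynamoDB Query on the email GSI.

    The table and index names are hypothetical placeholders.
    """
    return {
        "TableName": table,
        "IndexName": index,
        # "#email" is an expression attribute name alias for the index's
        # partition key attribute.
        "KeyConditionExpression": "#email = :email",
        "ExpressionAttributeNames": {"#email": "email"},
        "ExpressionAttributeValues": {":email": {"S": email}},
    }


params = email_lookup_params("alice@example.com")
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("dynamodb").query(**params)
```

Keeping the parameters separate from the call makes the request shape easy to inspect without touching a real table.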
But we want to enable multiple email addresses per user, and the DynamoDB Query operation doesn't support a List-typed KeyConditionExpression. So I'm weighing several options to avoid an expensive Scan operation every time a user signs up or wants to find another user by email address.
Below is what I'm planning to change to enable additional emails per user. Is this a good approach? Is there a better option?
- Add a sort key column (e.g. itemTypeAndIndex) to allow multiple items per userId.
{
userId: "08074c7e0c0a4453b3c723685021d0b6", // partition key
itemTypeAndIndex: "main", // sort key
email: "[email protected]",
... other attributes ...
}
- If the user adds a second, third, etc. email, then add a new item for each email, like this:
{
userId: "08074c7e0c0a4453b3c723685021d0b6", // partition key
itemTypeAndIndex: "Email-2", // sort key
email: "[email protected]"
// no more attributes
}
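To make the proposed layout concrete, here is a small sketch that expands one user with several emails into the items described above: one "main" item carrying the other attributes, plus one "Email-N" item per additional address. The function name `user_items`, the `profile` argument, and the example values are hypothetical and only for illustration.

```python
def user_items(user_id, emails, profile=None):
    """Expand a user into one item per email address.

    The first email goes on the "main" item together with the other
    attributes; every further email gets its own "Email-N" item that
    carries nothing but the keys and the email.
    """
    items = []
    for position, email in enumerate(emails, start=1):
        if position == 1:
            # Profile attributes first, so the key attributes always win.
            item = {**(profile or {}), "userId": user_id,
                    "itemTypeAndIndex": "main", "email": email}
        else:
            item = {"userId": user_id,
                    "itemTypeAndIndex": f"Email-{position}", "email": email}
        items.append(item)
    return items


items = user_items(
    "08074c7e0c0a4453b3c723685021d0b6",
    ["first@example.com", "second@example.com", "third@example.com"],
    profile={"displayName": "Example User"},
)
```

Every item carries an email attribute, so all of them would show up in an index keyed on email.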
- The same global secondary index (with email as the Partition Key) can still be used to find both primary and non-primary email addresses.
- If a user wants to change their primary email address, we'd swap the email values in the "primary" and "non-primary" items. (Now that DynamoDB supports transactions, doing this will be safer than before!)
- If we need to delete a user, we'd have to delete all the items for that userId. If we need to merge two users then we'd have to merge all items for that userId.
- The same approach (new items with same userId but different sort keys) could be used for other 1-user-has-many-values data that needs to be Query-able.
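The swap and delete steps above could look roughly like the sketch below. It only builds request parameters: one set for a TransactWriteItems call that swaps the two email values (each update conditioned on the item still holding the value we expect), and one set of BatchWriteItem payloads that delete every item of a user. The table name `Users`, the function names, and the example values are assumptions for illustration, and the sort keys to delete are assumed to come from a prior Query on userId.

```python
def swap_primary_email_params(user_id, secondary_sort_key,
                              primary_email, secondary_email, table="Users"):
    """Build TransactWriteItems parameters that swap two email values."""
    def update(sort_key, expected, new):
        return {
            "Update": {
                "TableName": table,
                "Key": {"userId": {"S": user_id},
                        "itemTypeAndIndex": {"S": sort_key}},
                "UpdateExpression": "SET #email = :new",
                # Fail the whole transaction if the item changed meanwhile.
                "ConditionExpression": "#email = :expected",
                "ExpressionAttributeNames": {"#email": "email"},
                "ExpressionAttributeValues": {":new": {"S": new},
                                              ":expected": {"S": expected}},
            }
        }
    return {"TransactItems": [
        update("main", primary_email, secondary_email),
        update(secondary_sort_key, secondary_email, primary_email),
    ]}


def delete_user_batches(user_id, sort_keys, table="Users", batch_size=25):
    """Build BatchWriteItem payloads deleting all of a user's items.

    BatchWriteItem accepts at most 25 requests per call, hence the chunking.
    A real caller would also need to retry any UnprocessedItems returned.
    """
    requests = [
        {"DeleteRequest": {"Key": {"userId": {"S": user_id},
                                   "itemTypeAndIndex": {"S": sk}}}}
        for sk in sort_keys
    ]
    return [{"RequestItems": {table: requests[i:i + batch_size]}}
            for i in range(0, len(requests), batch_size)]


swap = swap_primary_email_params(
    "08074c7e0c0a4453b3c723685021d0b6", "Email-2",
    "old-primary@example.com", "new-primary@example.com")
batches = delete_user_batches(
    "08074c7e0c0a4453b3c723685021d0b6",
    ["main"] + [f"Email-{n}" for n in range(2, 31)])
# With boto3, these could be sent as client.transact_write_items(**swap)
# and client.batch_write_item(**batch) for each batch.
```

Note that a global secondary index is updated asynchronously, so a lookup by email right after the swap may briefly return the old item.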
Is this a good way to do it? Is there a better way?