0
votes
GROUPS
    userID: string
    groupID: string
    lastActive: number
    birthday: number

Assume I have a DynamoDB table called GROUPS which stores items with these attributes. The table records which users are joined to which groups. Users can be in multiple groups at the same time. Therefore, the composite primary key would most-commonly be:

partition key: userID
sort key: groupID

However, if I wanted to query for all users in a specific group, within a specific birthday range, sorted by lastActive, is this possible and if so what index would I need to create?

Could I synthesize lastActive and userID to create a synthetic sort key, like so:

GROUPS
    groupID: string
    lastActiveUserID: string (i.e. "20201230T09:45:59-abc123")
    birthday: number

Which would make for a different composite primary key where the partition key is groupID and the sort key is lastActiveUserID, which would sort the participants by when they were last active, and then a secondary index to filter by birthday?

1

1 Answers

1
votes

As written, no this isn't possible.

within a specific birthday range

implies sk_birthday between :start and :end

sorted by lastActive

implies lastActive as a sort key.

which are mutually exclusive...I can't devise a sort key that would be able to contain both values in a usable format.

You could have a Global Secondary Index with a hash key of group-id and lastActive as a sort key, then filter on birthday. But, that only affects the data returned, it doesn't affect the data read nor the cost to read that data. Additionally, since DDB only reads 1MB of data at a time, you'd have to call it repeatedly in a loop if it's possibly a given group has more than 1MB worth of members.

Also, when your index has a different partition (hash) key than your table, that is a global secondary index (GSI). If your index has the same partition key but a different sort key than the table, that can be done with a local secondary index (LSI)

However for any given query, you can only use the table or a given index. You can't use multiple indexes at the same time

Now having said all that, what exactly to you mean by "specific birthday range" If the range in question is a defined period, by month, by week. Perhaps you could have a GSI where the hash key is "group-id#birthday-period" and sort key is lastActive

So for instance, "give me GROUPA birthdays for next month"
Query(hs = "GROUPA#NOVEMBER")

But if you wanted November and December, you'd have to make two queries and combine & sort the results yourself.

Effective and efficient use of DDB means avoiding Scan() and avoiding the use of filterExpressions that you know will throw away lots of the data read.