I am modelling the data of my application to use DynamoDB. My data model is rather simple:
- I have users and projects
- Each user can have multiple projects
There can be millions of users, and each user can have thousands of projects.
My access pattern is also rather simple:
- Get a user by id
- Get a paginated list of users sorted by name or creation date
- Get a project by id
- Get projects by user, sorted by date
My single table for this data model is the following:
I can easily implement all my access patterns using table PK/SK and GSIs, but I have issues with number 2. According to the documentation and best practices, to get a sorted list of paginated users:
- I can't use a scan, as sorting is not supported
- I should not use a GSI with a PK that would put all my users in the same partition (e.g. GSI PK = "sorted_user", SK = "name"), as that would make my single partition hot and would not scale
- I can't create a new entity of type "organisation", put all users in there, and query by PK = "org", as that would have the same hot partition issue as above
I could bucket users and use write sharding, but I don't really know how I could practically query paginated sorted users, as bucket PKs would possibly need to be random, and I would have to query all buckets to be able to sort all users together. I also thought that bucket PKs could be letters of the alphabet, but that could create hot partitions as well, as the letter "A" would probably be hit quite hard.
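To make the bucketing idea concrete, here is a minimal sketch of how a sorted page could be assembled from sharded buckets: ask every bucket for its next `page_size` items after the cursor, merge the already-sorted results, and keep the first `page_size`. Everything named here is an assumption for illustration: the `USERS#<n>` bucket PK, the `NUM_BUCKETS` value, the `(name, user_id)` sort key, and the `query_bucket` callable, which stands in for a real DynamoDB `Query` against a GSI. The in-memory helper only exists so the sketch runs without AWS.

```python
import hashlib
import heapq
from itertools import islice

NUM_BUCKETS = 4  # assumed shard count; size it to the expected write rate


def bucket_for(user_id: str) -> str:
    """Map a user id to a stable (hypothetical) bucket PK such as 'USERS#2'."""
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return f"USERS#{int(digest, 16) % NUM_BUCKETS}"


def page_of_users(query_bucket, page_size, after=None):
    """Return (page, cursor) for the next globally sorted page of users.

    query_bucket(pk, after, limit) must return at most `limit` sort keys
    from one bucket, in ascending order, all strictly greater than `after`.
    Sort keys here are (name, user_id) tuples so that ties on name are stable.
    """
    # Scatter: each bucket contributes at most page_size candidates.
    per_bucket = [
        query_bucket(f"USERS#{i}", after, page_size) for i in range(NUM_BUCKETS)
    ]
    # Gather: merge the sorted lists and keep only the first page_size items.
    page = list(islice(heapq.merge(*per_bucket), page_size))
    # A short page means every bucket is exhausted. A full page may still be
    # the last one, in which case the next call returns an empty page.
    cursor = page[-1] if len(page) == page_size else None
    return page, cursor


def in_memory_query(table):
    """Stand-in for a DynamoDB Query, for illustration only.

    `table` maps bucket PK -> list of (name, user_id) tuples, already sorted.
    """
    def query_bucket(pk, after, limit):
        rows = table.get(pk, [])
        if after is not None:
            rows = [row for row in rows if row > after]
        return rows[:limit]

    return query_bucket
```

Each page costs one query per bucket, so the read cost grows with `NUM_BUCKETS`; the number of buckets is a trade-off between write spread and read fan-out. Against a real table, `query_bucket` would presumably be a `Query` on the GSI with a key condition on the bucket PK and a "greater than cursor" condition on the sort key.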
My application model is rather simple. However, after having read all the docs and best practices and watched many online videos, I find myself stuck on the most basic use case, which DynamoDB does not seem to support well. I suppose it must be quite common to have to list users in some sort of admin panel for practically any modern application.
What would others do in this case? I would really like to use DynamoDB for all the benefits that it gives, especially in terms of cost.
Edit
Since I have been asked, in my app the main use case for 2) is something like this: https://stackoverflow.com/users?tab=Reputation&filter=all. As to the sizing, it needs to scale well, at least to the tens of thousands.