4
votes

I have read the dynamodb best practises and I am trying to design a table that allows me to query it in a flexible manner and avoid scans.

Here is the best design that I could get:

Table: Project
Hash: ClientId#ProjectId
Range: Name
Other attributes:

  • CreatedAt(timestamp)
  • Tags (TagId1,TagId2,TagId3)
  • Type

Note that I am putting multiple values in hash in range keys, with a "#" as a delimiter, as to have larger query capabilities.
My actual schema allows me to query the table in the following ways:

  • Get all the projects of a client(operator: BEGIN_WITH)
  • Get a specific project by its id knowing the client id(operator: EQ)
  • Order the results by Name
  • Filter the results on the Name

I want to allow my users to also filter on the "tags" attribute in addition to the already available filterings.
I also need my users to order by CreationAt instead of name but still be able to filter on the name.

Here is an example query that I can't do with the following scheme:

Get all the projects of a given Client, ordered by "CreationAt", containing a specific "TagId", and a given name and a type.

Another query, maybe simplier, that I can't achieve:
Get all the projects of a client containing a given substring for the name part, and a given type.

Any comments on my actual sheme ?
How to optimize it ?
How to design my secondary local indexes or custom index tables to be able to achieve my requests ?
Is there any way to reproduce the "LIKE" comparison operator of mysql ?

I'm beginning to ask myself if dynamodb is well suited for my project.

Thank you for your inputs.

1

1 Answers

2
votes

I think you are confusing the hash and range keys.

You should have clientId as the hash and projectId as the range. This will allow you to quickly get all projects of a client (QUERY) and fast row retrieval of a project (GET).

The rest of the conditions you want to query about might be reasonable with DynamoDB conditions, however - if the number of projects isn't large for a specific client, it will probably be easier to move the logic to the application layer after using QUERY.

Edit:

Regarding complex queries: With secondary indices you won't be able to combine queries with AND / OR between conditions.