0
votes

I'm trying to move from SQL Server to DynamoDB and I'm having trouble with queries that have multiple optional where parameters.

My SaaS application has ~30 customers. Each customer has around ~20k orders in the orders table and this keeps growing.

In the web application my customers can filter the orders very granular. This happens server based (DB based) via T-SQL.

In SQL my current solution looks like this:

WHERE 1=1
  AND order.IsDeleted IS NULL
    AND order.ObjectId IN (SELECT ObjectId FROM @allowedObjects)
    AND (@editorId IS NULL OR order.EditorId = @editorId)
    AND (@dealerId IS NULL OR order.DealerId = @dealerId)
    AND (@orderTypeId IS NULL OR order.OrderTypeId = @orderTypeId)
    AND (@objectId IS NULL OR order.ObjectId = @objectId)       
    AND (@read IS NULL OR (@read = 1 AND order.ReadOn IS NOT NULL) OR (@ready = 2 AND order.ReadOn IS NULL))
    AND (@query IS NULL OR (order.Reference LIKE @query OR order.ObjectJson LIKE @query OR order.ObjectDetailsJSON LIKE @query OR order.Title LIKE @query OR CONVERT(nvarchar, activity.ActivityId) LIKE @query))
    AND (@dueDateFrom IS NULL OR CAST(order.DueDate AS DATE) >= CAST(@duaeDateFrom AS DATE))
    AND (@dueDateTo IS NULL OR CAST(order.DueDate AS DATE) <= CAST(@dueDateTo AS DATE))
    AND (@createdOnFrom IS NULL OR CAST(order.CreatedOn AS DATE) >= CAST(@createdOn AS DATE))
    AND (@createdOnTo IS NULL OR CAST(order.CreatedOn AS DATE) <= CAST(@createdOnTo AS DATE))

The customer is able to filter a specific order by multimple optional parameters. So with that you could query down orders between two different creation and due date ranges with a specific order type that have been created by a specific user and are assigned to a specific dealer. Sometimes they just want all open orders of a specific type for a specific dealer.

My PK in DynamoDB ist like customer#83F8B46E-5FD9-4E04-AEB5-73F1AF7182F2

I know that I should avoid attribute filtering in DynamoDB since the number of returned records (what you pay for) only depends on PKs and SKs.

In DynamoDB my first approach was to "overload" (I don't know if this is the right word for concatenating fields in PK/SK) the SK like created_2020_05_05#editor_1234#object_66. But this only allows me to filter

  • all orders by specific editor regardless the object
  • all orders by specific editor for specific object (also excluding that object)
  • the above combined within a creation date range

I could create GSIs with different "overloaded" SKs and could use thise for key and sort key conditions.

Do you know any hints / strategies for my scenario?

Thanks in advance and BR!

1

1 Answers

2
votes

DynamoDB isn't really optimized for these kinds of access patterns, so I'd go about it this way:

  1. Find out the 2-3 attributes that are most widely used in filters
  2. Create a GSI for each of these attributes and do some analysis, which of these filters cuts down the amount of items retrieved the most, i.e. which of these will result in the lowest RCUs being used

When you get a request from the client, you:

  1. Check if any of the most frequent filter conditions you have GSIs for is present
  2. If that's the case, select the filter condition which cuts down most on the amount of results (see above). Go to 4.)
  3. If none of the most frequently used filter conditions is part of the query, you're going to have to select all orders of that customer (worst-case scenario)
  4. Do a Query against either the GSI from 3) or the primary index from 4).
  5. Filter the remaining data client-side or use a FilterExpression in DynamoDB to cut down on the Network Bandwidth being used. The latter won't help you with the RCUs being consumed however, filtering happens after reading.
  6. Return the filtered results

The trade-off here is between storage space + RCU/WCU being used for the GSIs and query speed.