0
votes

I'm new to DynamoDB and trying to figure out how to structure my data/table/index. My schema includes an itemid (unique) and an orderid (multiple items per order), along with some other arbitrary attributes. I want to be able to retrieve a single item by its itemid, but also retrieve a set of items by their OrderId.

My initial instinct was to set the itemid as the primary key and the orderid as the sort key, but that didn't allow me to query by orderid only. However the same problem occurs if I reverse those.

Example data:

ItemId OrderId
abc-123 1234
def-345 1234
ghi-678 5678
jkl-901 5678

I think I may need a Global Se but not quite understanding where those fit.

2
It sounds like you will need two indexes: docs.aws.amazon.com/amazondynamodb/latest/developerguide/…Mark B

2 Answers

0
votes

If your question is really whether you "are able" to do this, then with ItemId as the partition key, you can still retrieve by OrderId, with the Scan operation, which will let you filter by any attribute.

However Scan will perform full table scans, so the real question is probably whether you can retrieve by OrderId efficiently. In that case, you would indeed need a Global Secondary Index with OrderId and ItemId as the composite attribute key.

0
votes

This is typically achieved using what's called a "single table design". What this means, is that you store all your data in one table, and store it normalized, i.e. duplicate your data so that it fits your access patterns.

Generally speaking, if you do not know your access patterns beforehand, dynamodb might not be a good fit. For many systems, a good solution is to have the "main" access patterns in dynamo and then offloading some not so performance critical ad-hoc queries by replicating data to something like elasticsearh.

If you have a table with the hash key PK (String) and the sort key SK (String), you can store your data like this. Use transactions to keep the multiple items up to date and consistent etc.

PK SK shippingStatus totalPrice cartQuantity
order_1234 order_status PENDING 123123
order_1234 item_abc-123 3
order_1234 item_def-345 1
order_5678 order_status SHIPPED 54321
order_5678 item_jkl-901 5
item_abc-123 order_1234
item_abc-123 order_9876
item_abc-123 order_5656

This table illustrates the schemaless nature of a dynamo table (except from the PK/SK). With this setup, you can store "metadata" about the order in the order_1234/order_status item. Then, you can query for items with PK order_1234 and SK starts_with "item_" to get all the items for that order. You can do the same to get all the orders for an item - query for PK item_abc-123 and SK starting with "order_" to get all the orders.

I highly recommend this talk by Rick Houlihan to get into single table design and data modelling in dynamo :)

https://www.youtube.com/watch?v=HaEPXoXVf2k