I am seeking some clarity around the primary key choices for a DynamoDB table that my company's already implemented. The table consists of the following attributes:
WeekStartingOn STRING (represents the Monday of the week the transaction posted)
TransactionID STRING (UUID - unique)
VendorID STRING (UUID - unique by vendor)
dccAmount NUMBER
pointOfSaleTime STRING (Storing UNIX timestamp)
TerminalID NUMBER (UUID)
The table's primary key definition:
weekStartingON PRIMARY PARTITION KEY
TransactionID PRIMARY SORT KEY
Current GSI's: vendorIDIndex
VendorID PARITITON KEY
pointOfSaleTime SORT KEY
Sample data:
The main query type is:
For a vendor, show all the transactions in the past day, week, month, year, etc.
I believe the thinking behind the current layout is to group all of the past week's transactions contiguously, then from there, select the vendor's transactions.
I'm sure that this design is not right. Using weekStartingOn as the partition key will result in hot keys, as most vendors will want to look at, for example, everything since weekStartingOn = 2016-12-05. Also, sorting by transactionID doesn't make any sense.
I would be more inclined to have the base table primary key defined as per the vendorIDIndex, i.e.
VendorID PARTITION KEY
pointOfSaleTime SORT KEY
Even so, I still have a couple of problems with this design. Some of our Vendors are much larger than others and will make the distribution of reads/writes across partitions unbalanced. For example, VendorA may have 500000 daily transactions, but VendorB may have only 10 daily transactions. Also, I am not entirely convinced that a combination of VendorID and pointOfSaleTime is guaranteed to be unique.
Or, slightly more complicated, and would require work for the devs:
1 - Randomise the VendorID by adding a suffix, i.e. -1
2 - Depending on the number of suffixes, query the VendorID + Suffix, X amount of times
3 - Merge the results
I think I like the last option, but most difficult to implement considering where we currently are.
What would be the best design for this?
Many thanks