2
votes

Trying to get my head around this example from AWS to map a relational model to nosql

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-modeling-nosql-B.html

A key concept highlighted there is:

Important

.... Most well-designed applications require only one table. ...

Given that, the example table is as follows

dynamodb table

It explains,

You define the following entities, which support the relational order entry schema:

HR-Employee - PK: EmployeeID, SK: Employee Name

HR-Region - PK: RegionID, SK: Region Name

...

However, the entity HR-Employee - PK: EmployeeID, SK: Employee Name in the example table has SK values which are not Employee Names.

Also, it suggests the following query

enter image description here

but GSI-1 doesn't have PK of Employee Name.

I understand this could be a discrepancy in AWS documentation and I should raise it with them (which I have and they are notoriously bad in following up) but what I'm not sure is if the documentation is correct and my understanding is wrong (I'm inclined to believe the latter as AWS documentation is generally accurate).

Can someone guide me in the right direction in terms of a nosql schema mapping ? A correct example (with sample records of the dynamo table) for the schema in the above link would be much valued.

1

1 Answers

1
votes

So I'll try and make this clearer for you, let me know if something still doesn't make sense.

To start, you mention the fact that:

However, the entity HR-Employee - PK: EmployeeID, SK: Employee Name in the example table has SK values which are not Employee Names.

The reason there are SK values that are not "Employee Names" is because the SK isn't only for "Employee Names", it is also used by other queries (such as Region Name, Country Name, etc.). Think of the SK as exactly what it stands for, a sort key. The documentation seems to miss the explanation of the extra SK they have, so let me summarize what you're looking at.

You have HR-Employee1, with Employee Name = Employee1, QuotaID (guessing what this key is) = QUOTA-2017-Q1, Some Other Key = HR-CONFIDENTIAL

These key names are not actually defined in the table, they all go under the sort key, and are only implicitly "employee name" or "quota id" or "region name".

What this allows you to do is query the employees data, using employeeID as PK and employee name as SK, but it also lets you query the employee Quota data (or whatever it is) by using employeeID as PK and quotaID as SK.

The same applies to your second question, concerning GSI-1. In essence, the way they have designed the table in this scenario is you have an SK "SortKey" where you can have various types of values to sort on, if that makes sense.