

Items in a DynamoDB table with a composite hash-range primary key are unique. Does this extend to secondary indexes too?


I have a comments DynamoDB table with a post_id hash key and a comment_id range key. Additionally, there's a local secondary index with a date-user_id range key.

Each entry is a comment a user has left on a post. The purpose of the secondary index is to count how many unique users left a comment on a post on a specific day.

Entry 1: post_id: 1 comment_id: 1 date-user_id: 2014_06_24-1

Entry 2: post_id: 1 comment_id: 2 date-user_id: 2014_06_24-1

Entry 3: post_id: 1 comment_id: 3 date-user_id: 2014_06_24-2

When I query the secondary index with the condition post_id equals 1 and date-user_id equals 2014_06_24-1, I get a count of 2, but I'm expecting a count of 1.

Why does the secondary index have two entries with the same hash key/range key?


3 Answers


Each item in a Local Secondary Index (LSI) has a 1:1 relationship with the corresponding item in the table. In the example above, entry 1 and entry 2 in the LSI have the same range key value, but they point to different items in the table. Hence index keys (hash or hash+range) are not unique.

Global Secondary Indexes (GSIs) are similar to LSIs in this respect. Every GSI item contains the table hash and range keys of the corresponding item. More details are available at http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html#GSI.Projections
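
To make the 1:1 relationship concrete, here is a rough in-memory sketch in Python. It is a toy model, not the DynamoDB API: the `table`, `index` and `query_index` names are made up for illustration, and the data is the three entries from the question.

```python
# Toy in-memory model (NOT the DynamoDB API). It only illustrates why
# index keys are not unique.

# Base table: the (post_id, comment_id) primary key is unique.
table = {
    (1, 1): {"post_id": 1, "comment_id": 1, "date-user_id": "2014_06_24-1"},
    (1, 2): {"post_id": 1, "comment_id": 2, "date-user_id": "2014_06_24-1"},
    (1, 3): {"post_id": 1, "comment_id": 3, "date-user_id": "2014_06_24-2"},
}

# The index holds one entry per table item. Each entry also carries the
# table's range key (comment_id), which is what keeps the entries distinct.
index = sorted(
    (item["post_id"], item["date-user_id"], item["comment_id"])
    for item in table.values()
)


def query_index(post_id, date_user_id):
    """Return every index entry matching the index hash and range key."""
    return [e for e in index if e[0] == post_id and e[1] == date_user_id]


matches = query_index(1, "2014_06_24-1")
print(len(matches))  # 2: comments 1 and 2 share the same index key

# Counting unique users for the day means de-duplicating the index range key.
unique_users = {
    e[1] for e in index if e[0] == 1 and e[1].startswith("2014_06_24-")
}
print(len(unique_users))  # 2
```

The index has exactly as many entries as the table has items, so the count of 2 in the question is the expected result.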


Secondary indexes don't guarantee uniqueness. From the docs:

In addition, remember that global secondary indexes do not enforce uniqueness



No, they don't. Global secondary indexes are updated asynchronously, which makes them eventually consistent. It also means DynamoDB cannot enforce uniqueness at the time you make the update call: the call does not check the secondary index for uniqueness because the index update is an async operation, and even if it were checked later, there would be no way to return a failure, since the real-time call would already have finished. (Local secondary indexes support strongly consistent reads, but they do not enforce uniqueness either.)

On a side note, that's also why you can only perform Scan or Query on a GSI, but not GetItem: GetItem is expected to return exactly one item, but without a uniqueness constraint many items can correspond to a given secondary index key.
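
Since Query is the only way to read the index by key, the unique-user count from the question has to be de-duplicated on the client; as far as I know DynamoDB has no server-side "distinct". Below is a minimal sketch, assuming a boto3-style `Table` object whose `query` method accepts the usual Query parameters. The function name `count_unique_users` and the index name `date-user_id-index` are hypothetical, and `FakeTable` is only an offline stand-in so the example runs without AWS.

```python
def count_unique_users(table, index_name, post_id, day):
    """Count distinct date-user_id values for one post on one day."""
    seen = set()
    kwargs = {
        "IndexName": index_name,
        # "date-user_id" contains a hyphen, so it needs a name placeholder.
        "KeyConditionExpression": "post_id = :p AND begins_with(#du, :d)",
        "ExpressionAttributeNames": {"#du": "date-user_id"},
        "ExpressionAttributeValues": {":p": post_id, ":d": day + "-"},
    }
    while True:
        page = table.query(**kwargs)
        for item in page["Items"]:
            seen.add(item["date-user_id"])
        # Query results are paginated; keep going until no key is returned.
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return len(seen)
        kwargs["ExclusiveStartKey"] = last_key


class FakeTable:
    """Offline stand-in with canned pages; it ignores the key condition."""

    def query(self, **kwargs):
        if "ExclusiveStartKey" not in kwargs:
            return {
                "Items": [
                    {"date-user_id": "2014_06_24-1"},
                    {"date-user_id": "2014_06_24-1"},
                ],
                "LastEvaluatedKey": {"post_id": 1, "comment_id": 2},
            }
        return {"Items": [{"date-user_id": "2014_06_24-2"}]}


print(count_unique_users(FakeTable(), "date-user_id-index", 1, "2014_06_24"))
```

With a real table you would pass something like `boto3.resource("dynamodb").Table("comments")` instead of `FakeTable()`; the table name here is also an assumption.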