My team is migrating several different DynamoDB tables into a set of two DynamoDB tables. Essentially we're changing the model and consolidating a lot of the data into just two tables (right now it's spread across 6).

We were considering using DynamoDB Streams to trigger a Lambda that calls an API which writes to the new tables. However, I'm trying to figure out how to handle the old data too, since everything already in the old tables has to be migrated to the new tables as well.

I'm guessing one option is to run something that scans every item in each old table (some have around 100 million items) and, for each item, calls the same API the Lambda would call so that the new tables get written. However, I'm not sure how that scan behaves when new records are also being written to the old table fairly regularly during the migration.

Does anyone have some advice on doing such a migration and keeping things in sync?

Thanks!

1 Answer

Streams and Lambda are a good solution (by Abhaya Chauhan)...
(also a video on Pluralsight)

In this situation, try leveraging DynamoDB Streams and AWS Lambda to remodel data as needed.

A nice way to restructure your table definition is to leverage DynamoDB triggers, following these steps:

  1. Create a new table (let us call this NewTable) with the desired key structure, LSIs, and GSIs.

  2. Enable DynamoDB Streams on the original table.

  3. Associate a Lambda with the stream, which pushes each record into NewTable. (This Lambda should trim off the migration flag added in Step 5.)

  4. [Optional] Create a GSI on the original table to speed up scanning items. Ensure this GSI only has the attributes Primary Key and Migrated (see Step 5).

  5. Scan the GSI created in the previous step (or the entire table) with the following filter: FilterExpression = "attribute_not_exists(Migrated)". Update each returned item with a migration flag (i.e. "Migrated": { "S": "0" }) using the UpdateItem API, to ensure no data loss occurs. The update sends the item to the DynamoDB stream.

NOTE: You may want to increase the write capacity units on the table during the updates.

  6. The Lambda will pick up all items, trim off the Migrated flag, and push them into NewTable.
  7. Once all items have been migrated, repoint the code to the new table.
  8. Remove the original table and the Lambda function once you are satisfied that everything works.
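
To make the steps above more concrete, here is a minimal Python sketch of the two moving parts: the backfill pass from Step 5 and the stream-triggered Lambda from Steps 3 and 6. Treat it as an illustration under assumptions, not a finished migration tool. The table names (`OriginalTable`, `NewTable`), the helper names, and the `key_names` parameter are hypothetical. It assumes boto3, a stream view type that includes the new image (`NEW_IMAGE` or `NEW_AND_OLD_IMAGES`), and, for simplicity, that NewTable keeps the same key attributes as the original; the actual remodelling into your new key structure would go where `strip_flag` is called.

```python
import os

# Hypothetical table names; override through environment variables.
OLD_TABLE = os.environ.get("OLD_TABLE", "OriginalTable")
NEW_TABLE = os.environ.get("NEW_TABLE", "NewTable")
FLAG = "Migrated"


def strip_flag(image):
    """Return a copy of a DynamoDB-JSON item without the migration flag."""
    return {k: v for k, v in image.items() if k != FLAG}


def to_write_request(record):
    """Map one stream record to an (action, payload) pair for NewTable."""
    data = record["dynamodb"]
    if record["eventName"] == "REMOVE":
        # Assumption: NewTable uses the same key attributes as the old table.
        return ("delete", data["Keys"])
    # INSERT and MODIFY both carry the full new image.
    return ("put", strip_flag(data["NewImage"]))


def handler(event, context, client=None):
    """Lambda entry point (Steps 3 and 6): replay stream records on NewTable."""
    if client is None:
        import boto3  # imported lazily so the pure helpers work without it
        client = boto3.client("dynamodb")
    for record in event["Records"]:
        action, payload = to_write_request(record)
        if action == "delete":
            client.delete_item(TableName=NEW_TABLE, Key=payload)
        else:
            client.put_item(TableName=NEW_TABLE, Item=payload)
    return len(event["Records"])


def backfill(client, key_names):
    """Step 5: flag every unflagged item so it flows through the stream."""
    pages = client.get_paginator("scan").paginate(
        TableName=OLD_TABLE,
        FilterExpression="attribute_not_exists(#m)",
        ExpressionAttributeNames={"#m": FLAG},
    )
    touched = 0
    for page in pages:
        for item in page["Items"]:
            # UpdateItem changes only the flag, so a concurrent write to
            # other attributes of the same item is not overwritten.
            client.update_item(
                TableName=OLD_TABLE,
                Key={k: item[k] for k in key_names},
                UpdateExpression="SET #m = :zero",
                ExpressionAttributeNames={"#m": FLAG},
                ExpressionAttributeValues={":zero": {"S": "0"}},
            )
            touched += 1
    return touched
```

You would run something like `backfill(boto3.client("dynamodb"), ["pk"])` once, from any machine with credentials, after the Lambda is already attached to the stream. Items written by the application during the scan reach NewTable through the stream anyway, so it does not matter whether the scan sees them. One thing to watch: UpdateItem creates an item if the key no longer exists, so if items can be deleted during the backfill you may want to add a condition expression to the update.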