7 votes

I see there are tons of examples and documentation for copying data from DynamoDB to Redshift, but we are looking for an incremental copy process where only the new rows are copied from DynamoDB to Redshift. We will run this copy process every day, so there is no need to truncate and reload the entire Redshift table each day. Does anybody have any experience with or thoughts on this topic?

4 Answers

4 votes

DynamoDB has a feature (currently in preview) called Streams:

Amazon DynamoDB Streams maintains a time ordered sequence of item level changes in any DynamoDB table in a log for a duration of 24 hours. Using the Streams APIs, developers can query the updates, receive the item level data before and after the changes, and use it to build creative extensions to their applications built on top of DynamoDB.

This feature will allow you to process new updates as they come in and do what you want with them, rather than designing an export system on top of DynamoDB.

You can see more information about how the processing works in the Reading and Processing DynamoDB Streams documentation.
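Here is a rough sketch of what polling a stream could look like with boto3's dynamodbstreams client; the stream ARN is a placeholder, and since Streams is still in preview the details may change:

    import boto3

    streams = boto3.client("dynamodbstreams")

    # Placeholder ARN; use the stream ARN of your own table.
    STREAM_ARN = "arn:aws:dynamodb:us-east-1:123456789012:table/MyTable/stream/LABEL"

    # List the shards that make up the stream.
    shards = streams.describe_stream(StreamArn=STREAM_ARN)["StreamDescription"]["Shards"]

    for shard in shards:
        # Start from the oldest record still inside the 24-hour log.
        iterator = streams.get_shard_iterator(
            StreamArn=STREAM_ARN,
            ShardId=shard["ShardId"],
            ShardIteratorType="TRIM_HORIZON",
        )["ShardIterator"]

        while iterator:
            resp = streams.get_records(ShardIterator=iterator, Limit=100)
            for record in resp["Records"]:
                # Each record carries the event type (INSERT/MODIFY/REMOVE)
                # and the item image after the change.
                print(record["eventName"], record["dynamodb"].get("NewImage"))
            if not resp["Records"]:
                break  # caught up on this shard
            iterator = resp.get("NextShardIterator")

From there you can batch the changed items to S3 and COPY them into Redshift periodically, instead of re-exporting the whole table.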

1 vote

Redshift's COPY from DynamoDB can only copy the entire table. There are several ways to achieve an incremental copy:

  1. Using an AWS EMR cluster and Hive - if you set up an EMR cluster, you can use Hive tables to execute queries on the DynamoDB data and move the results to S3. That data can then be easily copied into Redshift (see the sketch after this list).

  2. You can store your DynamoDB data based on access patterns (see http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GuidelinesForTables.html#GuidelinesForTables.TimeSeriesDataAccessPatterns). If you store the data this way, for example one table per time period, each DynamoDB table can be dropped after it is copied to Redshift.
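For option 1, a hedged sketch of submitting the Hive export as a step to an existing EMR cluster with boto3; the cluster id, table names, column mapping, and S3 paths are all hypothetical placeholders:

    import boto3

    emr = boto3.client("emr")

    # Hypothetical Hive script stored in S3. It maps the DynamoDB table into
    # Hive and exports it to S3, e.g.:
    #   CREATE EXTERNAL TABLE ddb_orders (id string, created_at string)
    #   STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
    #   TBLPROPERTIES ("dynamodb.table.name" = "Orders",
    #                  "dynamodb.column.mapping" = "id:id,created_at:created_at");
    #   INSERT OVERWRITE DIRECTORY 's3://my-bucket/exports/orders/'
    #   SELECT * FROM ddb_orders;
    HIVE_SCRIPT = "s3://my-bucket/scripts/export_dynamodb.q"

    emr.add_job_flow_steps(
        JobFlowId="j-XXXXXXXXXXXXX",  # hypothetical cluster id
        Steps=[{
            "Name": "Export DynamoDB to S3",
            "ActionOnFailure": "CONTINUE",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": ["hive-script", "--run-hive-script",
                         "--args", "-f", HIVE_SCRIPT],
            },
        }],
    )

Once the export lands in S3, a standard Redshift COPY from that S3 prefix finishes the job.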

0 votes

This can be solved with a secondary DynamoDB table that tracks only the keys that were changed since the last backup. That table has to be updated wherever the initial DynamoDB table is updated (add, update, delete). At the end of the backup process you delete the tracked keys, either all at once or one by one as each row is backed up.
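A minimal sketch of the idea with boto3, assuming a hypothetical main table "MyTable" keyed on "id" and a key-only tracking table "MyTableChanges"; how each row actually reaches Redshift is left as a callback:

    import boto3

    dynamodb = boto3.resource("dynamodb")
    main = dynamodb.Table("MyTable")            # hypothetical main table
    changes = dynamodb.Table("MyTableChanges")  # hypothetical key-only table

    def put_item_tracked(item):
        """Write to the main table and record the key for the next backup."""
        main.put_item(Item=item)
        changes.put_item(Item={"id": item["id"]})

    def backup_changed_rows(copy_row_to_redshift):
        """Back up every changed row, then clear its key from the tracker."""
        for key in changes.scan()["Items"]:
            row = main.get_item(Key=key).get("Item")
            if row is not None:           # a deleted row has no current image
                copy_row_to_redshift(row)
            changes.delete_item(Key=key)

(For brevity the sketch ignores scan pagination and simply skips keys whose rows were deleted.)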

0 votes

If your DynamoDB table has either

  - a timestamp as an attribute, or
  - a binary flag which conveys data freshness as an attribute,

then you can write a Hive query to export only the current day's (or otherwise fresh) data to S3, and then 'KEEP_EXISTING' copy this incremental S3 data to Redshift.
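'KEEP_EXISTING' is an insert mode of AWS Data Pipeline's RedshiftCopyActivity: existing rows are kept and only new ones are inserted. If you are not using Data Pipeline, a hand-rolled equivalent with psycopg2 might look like the following; the connection string, table, key column, S3 path, and IAM role are all hypothetical:

    import psycopg2

    conn = psycopg2.connect(
        "host=my-cluster.example.com port=5439 dbname=analytics "
        "user=etl password=secret"  # hypothetical credentials
    )
    cur = conn.cursor()

    # Load the day's incremental export into a staging table.
    cur.execute("CREATE TEMP TABLE staging (LIKE orders);")
    cur.execute("""
        COPY staging FROM 's3://my-bucket/exports/orders/today/'
        CREDENTIALS 'aws_iam_role=arn:aws:iam::123456789012:role/RedshiftCopy'
        DELIMITER '\\t';
    """)

    # KEEP_EXISTING semantics: insert only keys Redshift does not have yet.
    cur.execute("""
        INSERT INTO orders
        SELECT s.* FROM staging s
        LEFT JOIN orders o ON o.id = s.id
        WHERE o.id IS NULL;
    """)
    conn.commit()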