How to perform AWS DynamoDB backup and restore operations by utilizing minimal read/write units?

Question

We are looking for a solution which uses minimum read/write units of DynamoDB table for performing full backup, incremental backup and restore operations. Backup should store in AWS S3 (open to other alternatives). We have thought of few options such as:

1) Using python multiprocessing and boto modules we were able to perform Full backup and Restore operations, it is performing well, but is taking more DynamoDB read/write Units.

2) Using AWS Data Pipeline service, we were able to perform Full backup and Restore operations.

3) Using Dynamo Streams and kinesis Adapter/ Dynamo Streams and Lambda function, we were able to perform Incremental backup.

Are there other alternatives for Full backup, Incremental backup and Restore operations. The main limitation/need is to have a scalable solution by utilizing minimal read/write units of DynamoDb table.

ketan vijayvargiya ketan vijayvargiya · Accepted Answer · 2017-02-02T18:42:51

Option #1 and #2 are almost the same- both do a Scan operation on the DynamoDB table, thereby consuming maximum no. of RCUs.

Option #3 will save RCUs, but restoring becomes a challenge. If a record is updated more than once, you'll have multiple copies of it in the S3 backup because the record update will appear twice in the DynamoDB stream. So, while restoring you need to pick the latest record. You also need to handle deleted records correctly.

You should choose option #3 if the frequency of restoring is less, in which case you can run an EMR job over the incremental backups when needed. Otherwise, you should choose #1 or #2.

How to perform AWS DynamoDB backup and restore operations by utilizing minimal read/write units?

2 Answers