
My DynamoDB table has around 100 million (30GB) items and I provisioned it with 10k RCUs. I'm using a data pipeline job to export the data.

The Data Pipeline Read Throughput Ratio is set to 0.9.

How do I calculate the time for the export to complete? (The pipeline is taking more than 4 hours to finish the export.)

How can I optimize this so that the export completes in less time?

How does the Read Throughput Ratio relate to DynamoDB export?

If you have point-in-time recovery activated, there is a much easier solution available now; see the news blog – Maurice

1 Answer


The answer to this question addresses most of your questions in regard to estimating the time for the Data Pipeline job to complete.
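As a rough illustration of the kind of estimate involved, here is a back-of-the-envelope lower bound on scan time from the numbers in your question. The function name and the assumptions are mine, not from AWS: it assumes the export scan uses eventually consistent reads (one RCU covers two 4 KB reads per second) and that the job actually sustains the full provisioned throughput times the ratio.

```python
RCU_READ_SIZE = 4 * 1024  # one read capacity unit covers up to 4 KB

def estimate_export_seconds(table_bytes, provisioned_rcu, throughput_ratio,
                            eventually_consistent=True):
    """Theoretical minimum scan time, ignoring EMR startup and key skew."""
    usable_rcu = provisioned_rcu * throughput_ratio
    # 1 RCU = 2 eventually consistent reads/s, or 1 strongly consistent read/s
    reads_per_rcu = 2 if eventually_consistent else 1
    bytes_per_second = usable_rcu * reads_per_rcu * RCU_READ_SIZE
    return table_bytes / bytes_per_second

# 30 GB table, 10k RCUs, ratio 0.9 -> roughly 7 minutes at full throughput
seconds = estimate_export_seconds(30 * 1024**3, 10_000, 0.9)
print(f"{seconds / 60:.1f} minutes")
```

Since your pipeline takes over 4 hours against a lower bound of minutes, the bottleneck is likely not read capacity but the EMR side of the job (cluster startup, mapper count, and how evenly the scan segments are distributed), which is where the linked answer's optimization advice applies.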

A much better way to export data from DynamoDB to S3 was announced in November 2020: you can now export directly from DynamoDB, without provisioning an EMR cluster and tons of RCUs.

Check out the documentation for: Exporting DynamoDB table data to Amazon S3
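For reference, a minimal sketch of the native export using the AWS CLI. The table ARN and bucket name are placeholders; the feature requires point-in-time recovery to be enabled on the table, and it reads from the PITR backup, so it consumes no RCUs at all.

```shell
# Start an export of the table's point-in-time state to S3.
# The ARN, bucket, and region below are placeholders.
aws dynamodb export-table-to-point-in-time \
    --table-arn arn:aws:dynamodb:us-east-1:123456789012:table/MyTable \
    --s3-bucket my-export-bucket \
    --export-format DYNAMODB_JSON

# The export runs asynchronously; check progress with the ARN it returns.
aws dynamodb describe-export \
    --export-arn <export-arn-from-previous-command>
```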