3
votes

I have a DynamoDB with ~16M records where each record is of size 4k. The table is configured for autoscaling Target utilization: 70%, Minimum provisioned capacity for Reads: 250 and Maximum provisioned capacity for Writes: 3000.

I am trying to setup data pipeline to backup DynamoDB to S3. The pipeline configuration asks for Read Throughput Ratio which is 0.25 by default.

So the question is how to compute Read Throughput Ratio to back up the table in ~1 Hours. I understand the read capacity units. How is the Read Throughput Ratio related to Read Capacity Units and Auto Scaling Configuration?

1

1 Answers

1
votes

Theoretically an RCU is 4KB so if you divide your data volume by 4KB you will get total RCU required for reading the complete data for the given second. So if you divide this value by 60*60 ( Minutes*Seconds) for 1 hour you will get the required RCU configuration but take into account the time required to setup EMR cluster. But I am confused on how this will behave if auto scaling is configured to the particular table.