1
votes

I have almost 300 threads trying to access a single DynamoDB table with very low provisioned read/write capacity (5/sec). The load is clearly too high, and I am getting ProvisionedThroughputExceededException from DynamoDBMapper. The question is: how do I make DynamoDBMapper wait and retry rather than fail?

I am using an ExponentialBackoffStrategy on AmazonDynamoDBClient with long delays (100 ms base, 12 retries, which yields 100 ms, 200 ms, 400 ms, ... 409600 ms), but it just doesn't work. In fact, it fails immediately. According to the docs:

ProvisionedThroughputExceededException

Message: You exceeded your maximum allowed provisioned throughput for a table or for one or more global secondary indexes. To view performance metrics for provisioned throughput vs. consumed throughput, open the Amazon CloudWatch console.

Example: Your request rate is too high. The AWS SDKs for DynamoDB automatically retry requests that receive this exception. Your request is eventually successful, unless your retry queue is too large to finish. Reduce the frequency of requests, using exponential backoff.

What am I doing wrong?

What does "too large to finish" mean?

Thanks!

In case you are wondering why I am using 300 threads with low provisioned capacity and can accept long response times: normally we use higher capacities; I just don't want the application to fail if the capacity is low.
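
For reference, the delay schedule described above can be reproduced with a few lines of plain Java. This is only a sketch of simple doubling backoff, not the SDK's implementation: the class BackoffSchedule and its methods are hypothetical names, and it assumes the first retry waits exactly the base delay (exponent starting at 0). Under that assumption 12 retries end at 204800 ms, and 409600 ms would be the delay before a 13th retry; if the exponent starts at 1 instead, the numbers shift by one step.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the AWS SDK: computes a plain doubling backoff schedule.
class BackoffSchedule {

    // Delay before retry number `retry` (0-based): base * 2^retry, capped at maxMillis.
    static long delayMillis(long baseMillis, int retry, long maxMillis) {
        // Clamp the shift so that large retry counts cannot overflow for small base values.
        int shift = Math.min(retry, 30);
        return Math.min(baseMillis << shift, maxMillis);
    }

    // All delays for the given number of retries, in order.
    static List<Long> schedule(long baseMillis, int retries, long maxMillis) {
        List<Long> delays = new ArrayList<>();
        for (int i = 0; i < retries; i++) {
            delays.add(delayMillis(baseMillis, i, maxMillis));
        }
        return delays;
    }

    public static void main(String[] args) {
        // 100 ms base, 12 retries, no cap (the settings from the question).
        List<Long> delays = schedule(100, 12, Long.MAX_VALUE);
        long total = 0;
        for (long d : delays) {
            total += d;
        }
        System.out.println(delays);
        System.out.println("total wait in ms: " + total);
    }
}
```

Passing a finite maxMillis shows how a cap flattens the tail of the schedule, which is usually preferable to waiting several minutes between attempts.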

1
5 req/sec is very little capacity for 300 threads. No matter how much backoff you use, there will be a lot of requests in the retry queue, and there is no documentation on how many requests the retry queue can hold. So in your case, IMO, it is exceeding that number very frequently, and you are getting this exception immediately, not for the latest request in the queue but for an old one. The best solution would be to increase your provisioned throughput to a reasonable number, as 5 is too low for 300 threads. - user156327
@AmitK: no, I don't get it, sorry. With 300 threads and a 409600 ms delay per request per thread, you get less than 1 request per second (on average). And the retry queue length cannot exceed the number of threads. Right? Or maybe this is caused by collisions between the exponential backoff intervals. Should I use randomized jitter? - STF

1 Answer

0
votes

OK, I had the wrong retry condition. Once that is fixed, you also need to adjust the maximum number of retries, and possibly the delays (base and max).
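
To illustrate how these settings interact, here is a minimal, SDK-independent sketch of a retry loop in plain Java. It is not the AWS SDK's retry code: RetryWithBackoff, call and shouldRetry are hypothetical names, and the random jitter is an addition prompted by the comment above rather than something I verified against the SDK. The point is that an error is retried only if the retry condition accepts it; if the condition rejects the throttling exception, the call fails immediately no matter how long the configured delays are.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Predicate;

// Hypothetical stand-alone helper; it does not use or mirror the AWS SDK's retry classes.
class RetryWithBackoff {

    // Runs op, retrying up to maxRetries times when shouldRetry accepts the exception.
    static <T> T call(Callable<T> op,
                      Predicate<Exception> shouldRetry,
                      int maxRetries,
                      long baseMillis,
                      long maxMillis) throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                // Give up if the condition rejects this error or the retry budget is spent.
                if (!shouldRetry.test(e) || attempt >= maxRetries) {
                    throw e;
                }
                // Exponential ceiling for this attempt: base * 2^attempt, capped at maxMillis.
                long ceiling = Math.min(maxMillis, baseMillis << Math.min(attempt, 30));
                // Jitter: sleep a random time in [0, ceiling] so threads do not retry in lockstep.
                Thread.sleep(ThreadLocalRandom.current().nextLong(ceiling + 1));
            }
        }
    }
}
```

A caller would pass a condition such as `e -> e instanceof ProvisionedThroughputExceededException`. In the SDK itself, the equivalent settings (retry condition, backoff strategy, maximum retries) are configured through the client's retry policy rather than through a hand-written loop like this one.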