0
votes

I need to find out the peak read capacity units consumed in the last 20 seconds on one of my DynamoDB tables. I need to find this programmatically in Java and set an auto-scaling action based on the usage.

Could you please share a sample Java program to find the peak read capacity units consumed in the last 20 seconds for a particular DynamoDB table?

Note: there are unusual spikes in the DynamoDB requests on the database, hence the need for dynamic auto-scaling.

I've tried this:

    DescribeTableResult result = DYNAMODB_CLIENT.describeTable(recomtableName);
    Long readCapacityUnits = result.getTable()
            .getProvisionedThroughput().getReadCapacityUnits();

but this gives the provisioned capacity, whereas I need the consumed capacity over the last 20 seconds.


2 Answers

0
votes

You could use the CloudWatch API getMetricStatistics method to get a reading for the capacity metric you require. A hint for the kinds of parameters you need to set can be found here.
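
For illustration, here is a minimal Java sketch of that call using the v1 AWS SDK. The table name, region defaults, and the five-minute look-back window are assumptions, and 60 seconds is the smallest period you can request for standard-resolution metrics:

    import java.util.Date;
    import com.amazonaws.services.cloudwatch.AmazonCloudWatch;
    import com.amazonaws.services.cloudwatch.AmazonCloudWatchClientBuilder;
    import com.amazonaws.services.cloudwatch.model.Datapoint;
    import com.amazonaws.services.cloudwatch.model.Dimension;
    import com.amazonaws.services.cloudwatch.model.GetMetricStatisticsRequest;
    import com.amazonaws.services.cloudwatch.model.GetMetricStatisticsResult;

    public class ReadCapacityPeak {
        public static void main(String[] args) {
            AmazonCloudWatch cloudWatch = AmazonCloudWatchClientBuilder.defaultClient();

            Date endTime = new Date();
            Date startTime = new Date(endTime.getTime() - 5 * 60 * 1000); // look back 5 minutes

            GetMetricStatisticsRequest request = new GetMetricStatisticsRequest()
                    .withNamespace("AWS/DynamoDB")
                    .withMetricName("ConsumedReadCapacityUnits")
                    .withDimensions(new Dimension().withName("TableName").withValue("my-table"))
                    .withStartTime(startTime)
                    .withEndTime(endTime)
                    .withPeriod(60)                   // 60 s is the minimum for standard-resolution metrics
                    .withStatistics("Sum", "Maximum");

            GetMetricStatisticsResult result = cloudWatch.getMetricStatistics(request);

            for (Datapoint dp : result.getDatapoints()) {
                // Sum / period gives the average read capacity units consumed per second in that minute
                double perSecond = dp.getSum() / 60.0;
                System.out.println(dp.getTimestamp() + " -> " + perSecond + " RCU/s (sum=" + dp.getSum() + ")");
            }
        }
    }

As the next answer points out, these datapoints lag real consumption by several minutes, so a true 20-second peak cannot be read straight out of CloudWatch.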

0
votes

For that you have to use CloudWatch.

    GetMetricStatisticsRequest metricStatisticsRequest = new GetMetricStatisticsRequest()
    metricStatisticsRequest.setStartTime(startDate)
    metricStatisticsRequest.setEndTime(endDate)
    metricStatisticsRequest.setNamespace("AWS/DynamoDB")
    // use 'ConsumedReadCapacityUnits' if you are interested in reads
    metricStatisticsRequest.setMetricName('ConsumedWriteCapacityUnits')
    metricStatisticsRequest.setPeriod(60)
    metricStatisticsRequest.setStatistics([
        'SampleCount',
        'Average',
        'Sum',
        'Minimum',
        'Maximum'
    ])
    List<Dimension> dimensions = []
    Dimension dimension = new Dimension()
    dimension.setName('TableName')
    dimension.setValue(dynamoTableHelperService.campaignPkToTableName(campaignPk))
    dimensions << dimension
    metricStatisticsRequest.setDimensions(dimensions)
    client.getMetricStatistics(metricStatisticsRequest)

But I bet you'd get results older than 5 minutes.

Actually, the current off-the-shelf auto-scaling uses CloudWatch. This does have a drawback, and for some applications it is unacceptable. When spike load hits your table, the table does not have enough capacity to respond with. Reserving some headroom is not enough, and the table starts throttling. If records are kept in memory while waiting for the table to respond, it can simply blow the memory up. CloudWatch, on the other hand, reacts after some time, often when the spike is already gone. Based on our tests it was at least 5 minutes, and it raised capacity gradually when it was needed straight up to the max.


Long story short: we have created a custom solution with its own speedometers. What it does is count whatever it has to count and change the table's capacity accordingly (a sketch of that counting idea is shown after the list below). There is still a delay because:

  1. The app itself takes a bit of time to understand what to do.

  2. The DynamoDB table takes ~30 seconds to get updated with the new capacity details.
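
For illustration only, here is one way such a "speedometer" could count consumed capacity in the application itself; this is a sketch under my own assumptions, not the author's actual code, and the key name "id" and the per-second window are made up. It asks DynamoDB to report consumed capacity on every request and aggregates it:

    import java.util.Collections;
    import java.util.concurrent.atomic.DoubleAdder;
    import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
    import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
    import com.amazonaws.services.dynamodbv2.model.AttributeValue;
    import com.amazonaws.services.dynamodbv2.model.GetItemRequest;
    import com.amazonaws.services.dynamodbv2.model.GetItemResult;
    import com.amazonaws.services.dynamodbv2.model.ReturnConsumedCapacity;

    public class ReadSpeedometer {
        private final AmazonDynamoDB dynamo = AmazonDynamoDBClientBuilder.defaultClient();
        private final DoubleAdder consumedInCurrentWindow = new DoubleAdder();

        public GetItemResult getItem(String tableName, String hashKeyValue) {
            GetItemRequest request = new GetItemRequest()
                    .withTableName(tableName)
                    .withKey(Collections.singletonMap("id", new AttributeValue(hashKeyValue)))
                    .withReturnConsumedCapacity(ReturnConsumedCapacity.TOTAL);

            GetItemResult result = dynamo.getItem(request);
            if (result.getConsumedCapacity() != null) {
                // Accumulate RCUs; a scheduled task reads and resets this window periodically
                consumedInCurrentWindow.add(result.getConsumedCapacity().getCapacityUnits());
            }
            return result;
        }

        /** Called by a scheduler (e.g. every second) to read and reset the window. */
        public double drainWindow() {
            return consumedInCurrentWindow.sumThenReset();
        }
    }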

On top of that we also have a throttling detector, so if a write/read request gets throttled we immediately raise the capacity accordingly. Sometimes the level of capacity looks all right, but throttling still happens because of a hot key issue.
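
A rough sketch of that throttling-detector idea follows; again an assumption of how it might look rather than the author's implementation, with the doubling factor and table name as placeholders. It catches the throttling exception and raises the table's provisioned throughput via UpdateTable:

    import java.util.function.Supplier;
    import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
    import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
    import com.amazonaws.services.dynamodbv2.model.ProvisionedThroughput;
    import com.amazonaws.services.dynamodbv2.model.ProvisionedThroughputDescription;
    import com.amazonaws.services.dynamodbv2.model.ProvisionedThroughputExceededException;
    import com.amazonaws.services.dynamodbv2.model.UpdateTableRequest;

    public class ThrottlingDetector {
        private final AmazonDynamoDB dynamo = AmazonDynamoDBClientBuilder.defaultClient();

        /** Wraps a DynamoDB call and reacts immediately when it gets throttled. */
        public <T> T withAutoScale(String tableName, Supplier<T> call) {
            try {
                return call.get();
            } catch (ProvisionedThroughputExceededException e) {
                onThrottled(tableName);
                throw e; // or retry once the capacity update has been applied
            }
        }

        private void onThrottled(String tableName) {
            ProvisionedThroughputDescription current = dynamo.describeTable(tableName)
                    .getTable().getProvisionedThroughput();

            long newReadUnits = current.getReadCapacityUnits() * 2;   // assumed scaling factor
            long newWriteUnits = current.getWriteCapacityUnits();

            dynamo.updateTable(new UpdateTableRequest()
                    .withTableName(tableName)
                    .withProvisionedThroughput(new ProvisionedThroughput(newReadUnits, newWriteUnits)));
            // Note: the table stays in UPDATING state for a while (~30 s as mentioned above),
            // and UpdateTable calls are themselves limited, so a real implementation must
            // rate-limit and back off between updates.
        }
    }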