1 vote

I'm having an issue with Dynamo where the read throughput is well below the provisioned capacity without any visible throttling in the graphs.

My table has 100 GB of data similar to this:

| Partition Key | Sort Key | Value |
|---------------|----------|-------|
| A             | A1       | 1     |
| A             | A2       | 21    |
| A             | A3       | 231   |
| ...           | ...      | ...   |
| A             | A200     | 31    |
| B             | B1       | 5     |

This structure cannot change much, as it is important that I can query all values associated with a given partition key (and run more complex queries based on the sort key within a given partition key). This causes throttled writes, as it must be hitting the same partitions frequently, but what is really strange is the read throughput. The table has 1,000 read units provisioned, but the maximum recorded throughput is 600 reads per second. This remains the case even with up to 10,000 provisioned read units per second.
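For concreteness, the access pattern looks roughly like this boto3 sketch (the table, attribute, and key names are placeholders, not my real schema):

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Fetch every item under one partition key, optionally narrowed by a
# sort-key condition (placeholder table and attribute names).
response = dynamodb.query(
    TableName="my-table",
    KeyConditionExpression="pk = :pk AND begins_with(sk, :prefix)",
    ExpressionAttributeValues={
        ":pk": {"S": "A"},
        ":prefix": {"S": "A1"},
    },
)
items = response["Items"]
```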

On the client side, I'm sending 1,000 requests per second (spaced uniformly, using a rate limiter), so in theory the read throughput should be 1,000 reads per second. Even if the number of requests is increased on the client side, the rate stays the same, and there are zero throttled reads.
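For illustration, a minimal single-threaded sketch of this kind of rate-limited read loop (boto3 here, with placeholder names; the real client spreads the calls across threads):

```python
import time
import boto3

dynamodb = boto3.client("dynamodb")
TARGET_RPS = 1000
INTERVAL = 1.0 / TARGET_RPS

def read_at_fixed_rate(keys):
    """Space GetItem calls uniformly in time (placeholder table/keys)."""
    next_slot = time.monotonic()
    for pk, sk in keys:
        now = time.monotonic()
        if now < next_slot:
            time.sleep(next_slot - now)  # wait for the next 1 ms slot
        next_slot += INTERVAL
        dynamodb.get_item(
            TableName="my-table",
            Key={"pk": {"S": pk}, "sk": {"S": sk}},
        )
```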

The client is running on an EC2 m4.2xlarge instance in the same region as Dynamo. I've ruled out an issue with the client as the CPU usage is fairly low, and there is plenty of memory available.

Any thoughts on what could be causing this?

So you are saying that your 1,000 reads (/sec) all succeed, yet the system counts it as 600? – Michael - sqlbot

@Michael-sqlbot If I send 1,000 reads per second then yes, they all succeed, but only 600/500 per second are counted; if I send 2,000, it does fewer than 2,000 per second, but more than shown on the graphs. – Gonçalo

2 Answers

0 votes

The amount of data per item affects how many RCUs each read consumes.

See: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ProvisionedThroughput.html

One read capacity unit represents one strongly consistent read per second, or two eventually consistent reads per second, for an item up to 4 KB in size. If you need to read an item that is larger than 4 KB, DynamoDB will need to consume additional read capacity units. The total number of read capacity units required depends on the item size, and whether you want an eventually consistent or strongly consistent read.

You need to check whether you are using strongly consistent reads, and how much data you are fetching per read.
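One way to check both at once is to ask DynamoDB to report the capacity each read actually consumes (a boto3 sketch; table, attribute, and key names are placeholders):

```python
import boto3

dynamodb = boto3.client("dynamodb")

response = dynamodb.get_item(
    TableName="my-table",
    Key={"pk": {"S": "A"}, "sk": {"S": "A1"}},
    ConsistentRead=False,            # eventually consistent: half the cost
    ReturnConsumedCapacity="TOTAL",  # include the RCU cost in the response
)
print(response["ConsumedCapacity"]["CapacityUnits"])
# 0.5 for an eventually consistent read of an item up to 4 KB,
# 1.0 for a strongly consistent one; larger items cost more.
```

Note that if the reads are eventually consistent and the items are small, 1,000 reads per second would consume only about 500 units, which could explain graphs in the range you describe.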

0 votes

A few thoughts

  1. In your test, are you spreading your queries over all of your partition keys? Dynamo distributes throughput over all partitions, so if you were hitting only a subset of partitions you might not achieve your headline throughput.
  2. Do you know how much data each read returns? One read capacity unit can return up to 4 KB of data. If some of your results are larger than 4 KB, you would be getting fewer than 1,000 reads per second from 1,000 RCUs.
  3. Do you know how many partitions your table has and how your throughput is spread over them? A single partition can only have 3,000 RCUs. Temporarily increasing throughput can cause your table to create new partitions, with your throughput spread across each one. When you then wind the RCUs back down, the data remains in the same number of partitions, so your RCUs are spread more thinly (see the sketch after this list).
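As a rough illustration of point 3 (the partition count here is an assumption for the arithmetic, not a measured value):

```python
# Hypothetical numbers for illustration only.
provisioned_rcu = 1000
partitions = 4                      # e.g. left over from a past RCU burst

# Throughput is divided evenly across partitions, so each partition
# gets only a fraction of the table-level provisioning.
rcu_per_partition = provisioned_rcu / partitions
print(f"{rcu_per_partition:.0f} RCUs per partition")   # 250

# A workload concentrated on one hot partition is then capped near
# that per-partition figure, well below the table's 1,000 RCUs.
```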