I made a table with 1346 items, each item being less than 4KB in size. I provisioned 1 read capacity unit, so I'd expect on average 1 item read per second. However, a simple scan of all 1346 items returns almost immediately.
What am I missing here?
This is likely down to burst capacity: DynamoDB banks your unused capacity over a 300-second window, and you can spend it on bursty operations (such as scanning an entire table).
This would mean that if you used up all of these credits, other requests would suffer because there would not be enough capacity left for them.
You can see the amount of consumed WCU/RCU via either CloudWatch metrics or within the DynamoDB interface itself (via the Metrics tab).
You don't give a size for your entries except to say "each item being less than 4KB". How much less?
1 RCU will support 2 eventually consistent reads per second of items up to 4KB.
To put that another way, with 1 RCU and eventually consistent reads, you can read 8KB of data per second.
If your records are 4KB, then you get 2 records/sec
1KB, 8/sec
512B, 16/sec
256B, 32/sec
So the "burst" capability already mentioned allowed you to use 55 RCU. But the small size of your records allowed that 55 RCU to return the data "almost immediately"
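The arithmetic behind that list can be sketched in a few lines of Python. This is only a back-of-the-envelope model that assumes you are billed by total bytes read (as in a Scan), with 1 RCU covering 8KB/sec of eventually consistent reads; the function name `items_per_second` is made up for illustration.

```python
RCU_BYTES = 4 * 1024  # one RCU covers 4 KB per second of strongly consistent reads


def items_per_second(item_size_bytes, rcu=1, eventually_consistent=True):
    """Approximate items/sec when the cost depends on total bytes read."""
    # Eventually consistent reads cost half as much, so the byte budget doubles.
    bytes_per_second = rcu * RCU_BYTES * (2 if eventually_consistent else 1)
    return bytes_per_second / item_size_bytes


for size in (4096, 1024, 512, 256):
    print(f"{size} B items: {items_per_second(size):.0f}/sec")
```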
There are two things working in your favor here. One is that a Scan operation takes significantly fewer RCUs than you thought it did for small items. The other is the "burst capacity". I'll try to explain both:
The DynamoDB pricing page says that "For items up to 4 KB in size, one RCU can perform two eventually consistent read requests per second." This suggests that even if the item is 10 bytes in size, it costs half an RCU to read it with eventual consistency. However, although they don't state this anywhere, this cost is only true for a GetItem operation to retrieve a single item. In a Scan or Query, it turns out that you don't pay separately for each individual item. Instead, these operations scan data stored on disk sequentially, and you pay for the amount of data thus read. If you have 1000 tiny items and the total size that DynamoDB had to read from disk was 80KB, you will pay 80KB/4KB/2, or 10 RCUs, not 500 RCUs.
This explains why you read 1346 items, and measured only 55 RCUs, not 1346/2 = 673.
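To make the difference concrete, here is a rough sketch of the two accounting models in Python. The function names and the exact rounding (round up to the next 4KB, then halve for eventual consistency) are my assumptions for illustration; the real accounting may differ in details such as per-page rounding.

```python
import math

UNIT_BYTES = 4096  # size covered by one strongly consistent read unit


def scan_rcus(total_bytes, eventually_consistent=True):
    """Scan/Query style: pay for the total amount of data read."""
    units = math.ceil(total_bytes / UNIT_BYTES)
    return units / 2 if eventually_consistent else units


def get_item_rcus(item_count, item_size_bytes, eventually_consistent=True):
    """GetItem style: every item is rounded up to 4 KB on its own."""
    per_item = math.ceil(item_size_bytes / UNIT_BYTES)
    cost = per_item / 2 if eventually_consistent else per_item
    return item_count * cost


print(scan_rcus(80 * 1024))     # 1000 tiny items totalling 80 KB -> 10.0
print(get_item_rcus(1000, 10))  # the same items read one by one -> 500.0
```

Read backwards under the same assumptions, 55 RCUs corresponds to at most about 440KB of data scanned (55 x 8KB), which would put the average item well under 4KB.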
The second thing working in your favor is that DynamoDB has the "burst capacity" capability, described here:
DynamoDB currently retains up to 5 minutes (300 seconds) of unused read and write capacity. During an occasional burst of read or write activity, these extra capacity units can be consumed quickly—even faster than the per-second provisioned throughput capacity that you've defined for your table.
So if your database existed for 5 minutes prior to your request, DynamoDB saved 300 RCUs for you, which you can use up very quickly. Since 300 RCUs is much more than you needed for your scan (55), your scan happened very quickly, without throttling.
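A toy model of that behaviour, just to illustrate the reasoning: treat the saved capacity as a bucket that fills at the provisioned rate and is capped at 300 seconds' worth. The class `BurstBucket` is hypothetical and is not how DynamoDB actually implements it; it also ignores the capacity provisioned for the current second.

```python
class BurstBucket:
    """Toy model: unused capacity accumulates, capped at `retention_seconds` worth."""

    def __init__(self, provisioned_rcu, retention_seconds=300):
        self.rate = provisioned_rcu
        self.capacity = provisioned_rcu * retention_seconds
        self.saved = 0.0

    def idle(self, seconds):
        # Unused capacity is banked while the table sits idle, up to the cap.
        self.saved = min(self.capacity, self.saved + self.rate * seconds)

    def consume(self, rcus):
        """Return True if the request fits in the saved capacity, else False (throttled)."""
        if rcus <= self.saved:
            self.saved -= rcus
            return True
        return False


bucket = BurstBucket(provisioned_rcu=1)
bucket.idle(300)            # table existed, unused, for 5 minutes
print(bucket.consume(55))   # the 55-RCU scan fits in one go -> True
print(bucket.saved)         # what is left for other requests -> 245.0
```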
When you do a query, the RCU count applies to the quantity of data read without considering the number of items read. So if your items are small, say a few bytes each, they can easily be queried inside a single 4KB RCU.
This is especially useful when reading many items from DynamoDB as well. It's not immediately obvious that querying many small items is far cheaper and more efficient than BatchGetting them.
aws dynamodb scan --table-name witness_library_events --return-consumed-capacity TOTAL
and got ConsumedCapacity: CapacityUnits: 55.0
– bgoosman
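For anyone who wants to repeat that measurement from code rather than the CLI, something like the following should work with boto3. It is a sketch, not tested against a real table: the helper name `scan_with_capacity` is made up, and it assumes the client's `scan` call returns `Count`, `ConsumedCapacity` and `LastEvaluatedKey` as the low-level DynamoDB API does.

```python
def scan_with_capacity(client, table_name):
    """Scan a whole table and return (item_count, total_capacity_units)."""
    kwargs = {"TableName": table_name, "ReturnConsumedCapacity": "TOTAL"}
    item_count = 0
    capacity = 0.0
    while True:
        page = client.scan(**kwargs)
        item_count += page.get("Count", 0)
        capacity += page.get("ConsumedCapacity", {}).get("CapacityUnits", 0.0)
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            # No more pages: the whole table has been read.
            return item_count, capacity
        kwargs["ExclusiveStartKey"] = last_key


# Usage (needs boto3 installed and AWS credentials configured):
#   import boto3
#   print(scan_with_capacity(boto3.client("dynamodb"), "witness_library_events"))
```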