We have reserved various number of RUs per second for our various collections. I'm trying to optimize this to save money. For each response from Cosmos, we're logging the request charge property to Application Insights. I have one analytics query that returns the average number of request units per second and one that returns the maximum.
let start = datetime(2019-01-24 11:00:00);
let end = datetime(2019-01-24 21:00:00);
customMetrics
| where name == 'RequestCharge' and start < timestamp and timestamp < end
| project timestamp, value, Database=tostring(customDimensions['Database']), Collection=tostring(customDimensions['Collection'])
| make-series sum(value) default=0 on timestamp in range(start, end, 1s) by Database, Collection
| mvexpand sum_value to typeof(double), timestamp limit 36000
| summarize avg(sum_value) by Database, Collection
| order by Database asc, Collection asc
let start = datetime(2019-01-24 11:00:00);
let end = datetime(2019-01-24 21:00:00);
customMetrics
| where name == 'RequestCharge' and start <= timestamp and timestamp <= end
| project timestamp, value, Database=tostring(customDimensions['Database']), Collection=tostring(customDimensions['Collection'])
| summarize sum(value) by Database, Collection, bin(timestamp, 1s)
| summarize arg_max(sum_value, *) by Database, Collection
| order by Database asc, Collection asc
The averages are fairly low but the maxima can be unbelievably high in some cases. An extreme example is a collection with a reservation of 1,000, an average used of 15,59 and a maximum used of 63,341 RUs/s.
My question is: How can this be? Are my queries wrong? Is throttling not working? Or does throttling only work on a longer period of time than a single second? I have checked for request throttling on the Azure Cosmos DB overview dashboard (response code 429), and there was none.