I am trying to understand the relationship between Physical/Logical partitions and throughput availability in Azure Cosmos DB and have a question about the throughput available to each logical partition.

Is the throughput available for a physical partition split evenly amongst its logical partitions or is it randomly distributed in the sense that any logical partition can use 0 - 100% of the throughput available to the physical partition?

The reason I ask this is because I am seeing conflicting answers.

  1. In this Cosmos DB Conf presentation - Partitioning Tips for Azure Cosmos DB to Increase Performance and Save Money, the presenter mentioned that throughput available for a physical partition is evenly distributed amongst all logical partitions inside that physical partition (or at least that's what I inferred).

  2. However, the documentation I referenced mentions the following (emphasis mine).

If you provision a throughput of 18,000 request units per second (RU/s), then each of the three physical partitions can utilize 1/3 of the total provisioned throughput. Within the selected physical partition, the logical partition keys Beef Products, Vegetable and Vegetable Products, and Soups, Sauces, and Gravies can, collectively, utilize the physical partition's 6,000 provisioned RU/s.

From the documentation it seems the size or utilization of a logical partition does not really matter and I could have some logical partitions getting more requests than others but as long as I am not exceeding the available throughput of the physical partition, I should be fine. Is this correct?

P.S. This is part 2 of the question I posted here: Some questions about Cosmos DB Physical and Logical Partitions.

1 Answer

Is the throughput available for a physical partition split evenly amongst its logical partitions or is it randomly distributed in the sense that any logical partition can use 0 - 100% of the throughput available to the physical partition?

Provisioned throughput is distributed equally among the physical partitions. Within a physical partition, it is NOT divided equally among the logical partitions: any logical partition can use anywhere from 0 to 100% of the throughput assigned to that physical partition. Only when the combined utilization of the physical partition goes beyond 100% will you see throttling errors (HTTP 429).
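To make the arithmetic concrete, here is a minimal Python sketch of that model using the numbers from the documentation excerpt (18,000 RU/s over three physical partitions). It is only an offline illustration and not a Cosmos DB SDK call; the function names and the per-key usage figures are made up for the example.

```python
def per_partition_budget(provisioned_rus: int, physical_partitions: int) -> float:
    """Provisioned throughput is split evenly across physical partitions."""
    return provisioned_rus / physical_partitions


def is_throttled(logical_usage: dict[str, float], budget: float) -> bool:
    """Only the combined usage of the logical partitions is compared with the
    physical partition's budget; no per-logical-partition share is enforced."""
    return sum(logical_usage.values()) > budget


# 18,000 RU/s over 3 physical partitions gives 6,000 RU/s each.
budget = per_partition_budget(18_000, 3)

# Hypothetical usage: one hot key takes most of the budget, total is 5,900 RU/s.
skewed = {
    "Beef Products": 5_500,
    "Vegetable and Vegetable Products": 300,
    "Soups, Sauces, and Gravies": 100,
}

# Hypothetical usage: same hot key, but the total is now 6,100 RU/s.
over_budget = {
    "Beef Products": 5_500,
    "Vegetable and Vegetable Products": 400,
    "Soups, Sauces, and Gravies": 200,
}

print(budget)                             # 6000.0
print(is_throttled(skewed, budget))       # False
print(is_throttled(over_budget, budget))  # True
```

The skewed case is fine even though one key consumes over 90% of the partition's budget; throttling starts only once the sum crosses 6,000 RU/s.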

From the documentation it seems the size or utilization of a logical partition does not really matter and I could have some logical partitions getting more requests than others but as long as I am not exceeding the available throughput of the physical partition, I should be fine. Is this correct?

This is partly true. The logical partition size does matter: a logical partition can't be larger than 20 GB. Its throughput is also capped at 10K RU/s, because that is the most a single physical partition can serve. We have no control over how logical partitions are mapped to physical partitions, so there is no real way for you to know which physical partition a given logical partition lives on. Similarly, there is no way to ensure that you don't exceed the throughput of a physical partition (at most 10K RU/s). This is why Microsoft recommends choosing your partition key so that utilization is balanced across logical partitions.
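If you have per-key size and RU figures from your own telemetry, a rough sanity check against these limits could look like the sketch below. The helper names and the 50% share threshold are assumptions chosen for illustration; nothing here queries Cosmos DB itself.

```python
MAX_LOGICAL_PARTITION_GB = 20        # size limit of one logical partition
MAX_PHYSICAL_PARTITION_RUS = 10_000  # ceiling of one physical partition


def check_logical_partition(size_gb: float, peak_rus: float) -> list[str]:
    """Return which limits a single logical partition would violate."""
    problems = []
    if size_gb > MAX_LOGICAL_PARTITION_GB:
        problems.append("size")
    if peak_rus > MAX_PHYSICAL_PARTITION_RUS:
        problems.append("throughput")
    return problems


def hot_keys(rus_by_key: dict[str, float], max_share: float = 0.5) -> list[str]:
    """List keys whose share of total RU consumption exceeds max_share
    (a hypothetical threshold; pick one that suits your workload)."""
    total = sum(rus_by_key.values())
    if total == 0:
        return []
    return [key for key, rus in rus_by_key.items() if rus / total > max_share]


print(check_logical_partition(size_gb=25, peak_rus=4_000))  # ['size']
print(hot_keys({"a": 900, "b": 50, "c": 50}))               # ['a']
```

A key that shows up in `hot_keys` is not necessarily a problem today, but it is the one most likely to push its physical partition into throttling as traffic grows.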