0
votes

We have a write-heavy workload, so much so that our application is constantly being rate limited by Cosmos DB (Mongo API). The insert rate we are getting from Cosmos cannot keep up with the rate at which we need to insert data.

First, we already have autoscale enabled, with the RU/s currently set to 55000. We might switch to serverless, but before that I need to understand how Cosmos DB physical and logical partitions work, and whether our partition key selection is correct.

So Cosmos states that

Maximum RUs per (logical) partition 10000

We partition data by the hour, as in the example below (we do this because we plan to filter on date in our read requests):

2020-09-17 00:00:00  -> logical partition 1
2020-09-17 01:00:00  -> logical partition 2
2020-09-17 02:00:00  -> logical partition 3

and so on.

The Cosmos DB documentation also mentions:

If we provision a throughput of 18,000 request units per second (RU/s), then each of the three physical partitions can utilize 1/3 of the total provisioned throughput. Within the selected physical partition, the logical partition keys Beef Products, Vegetable and Vegetable Products, and Soups, Sauces, and Gravies can, collectively, utilize the physical partition's 6,000 provisioned RU/s.

Physical partitions are internal to Cosmos DB, as in the scenario above, but the statement quoted above is what puzzles me.

So my questions are:

If our script is inserting records for the shard key

2020-09-18 00:00:00

  1. Will the 2020-09-18 00:00:00 logical partition get the full 51000 RU, or 10000 RU as mentioned by Cosmos?

  2. If we have 100 physical partitions, are the RUs shared among all 100 partitions equally (strictly), even when the other physical partitions are not serving any requests?

3 Answers

0
votes

It sounds like all that's happening with your hourly partition key is that all the writes are rotated to a new hot (bottleneck) partition each hour. Since one partition is limited to 10K RU as you note, that would be your system's effective write throughput at any given time.

A different partitioning strategy would be needed to distribute writes, like those discussed in the synthetic partition key docs. If you have some other candidate partitioning value (even a random suffix) to add to or replace the timespan value, that would allow multiple parallel write partitions and thus provide increased throughput.
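As a rough illustration of the random-suffix idea, here is a minimal Python sketch. The function name `synthetic_partition_key` and the suffix count of 10 are assumptions made up for this example, not values from the question or from the Cosmos DB docs; you would tune the suffix count to your own write volume.

```python
import random
from datetime import datetime

# Hypothetical number of suffix buckets per hour; tune for your workload.
SUFFIX_COUNT = 10


def synthetic_partition_key(ts: datetime, suffix_count: int = SUFFIX_COUNT) -> str:
    """Build an hourly key with a random suffix so writes spread over several logical partitions."""
    # Truncate the timestamp to the start of its hour, as in the original scheme.
    hour_bucket = ts.replace(minute=0, second=0, microsecond=0)
    # Pick one of suffix_count buckets at random for this write.
    suffix = random.randrange(suffix_count)
    return f"{hour_bucket:%Y-%m-%d %H:%M:%S}-{suffix}"


key = synthetic_partition_key(datetime(2020, 9, 18, 0, 15, 30))
print(key)  # e.g. "2020-09-18 00:00:00-7" (the suffix varies per call)
```

With this kind of key, the writes for a single hour land on up to `SUFFIX_COUNT` logical partitions instead of one, at the cost of reads having to look at every suffix for the hour.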

0
votes

Partitioning on date/time is probably one of the worst partition keys you can choose for a write heavy workload because you will always have a hot partition for the current time.

10K RU/s is the limit for a physical partition, not a logical one.

I would strongly recommend a new partition key that does a better job of distributing writes across a wider partition key range. If you can query your data using that same partition key value or at least a range of values such that it is bounded in some way and not a complete fan out query, you will be in much better shape.
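To make the "bounded range of values" point concrete, here is a small Python sketch. It assumes a hypothetical key format of hourly timestamp plus a numeric suffix (`"YYYY-MM-DD HH:MM:SS-<n>"`); the function name `keys_for_range` and the suffix count are illustrative only. It lists the exact partition key values a date-range read would need to target, so the query stays bounded instead of fanning out to everything.

```python
from datetime import datetime, timedelta


def keys_for_range(start: datetime, end: datetime, suffix_count: int = 10) -> list[str]:
    """List every suffixed hourly key covering the half-open interval [start, end)."""
    # Align the start to the beginning of its hour.
    current = start.replace(minute=0, second=0, microsecond=0)
    keys = []
    while current < end:
        # Each hour is split across suffix_count logical partitions.
        for suffix in range(suffix_count):
            keys.append(f"{current:%Y-%m-%d %H:%M:%S}-{suffix}")
        current += timedelta(hours=1)
    return keys


keys = keys_for_range(datetime(2020, 9, 17, 0, 0), datetime(2020, 9, 17, 3, 0), suffix_count=4)
print(len(keys))  # 3 hours x 4 suffixes = 12 keys
```

The resulting list could then be used in an "in" style filter on the partition key, so the read touches a known, limited set of partitions.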

0
votes

Based on our recent project experience, where we faced something similar with Cosmos DB, and on the conversations we had with Microsoft's Cosmos team:

  1. Will the 2020-09-18 00:00:00 logical partition get the full 51000 RU, or 10000 RU as mentioned by Cosmos?

RUs are distributed according to the number of physical partitions. If your provisioned throughput is 55000 RU/s, Cosmos will create 6 physical partitions internally (since one physical partition can have at most 10000 RU/s provisioned to it), and each partition is provisioned the same share of the RUs. So the 2020-09-18 00:00:00 logical partition gets RUs equal to those provisioned to the one physical partition in which it resides.
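The arithmetic behind that can be sketched in a few lines of Python. This only reproduces the reasoning above, assuming the 10000 RU/s per physical partition cap and an even split; the function names are made up for illustration, and the real partition count is decided internally by Cosmos and can also depend on storage size.

```python
import math

# Cap on provisioned throughput per physical partition, as stated above.
MAX_RU_PER_PHYSICAL_PARTITION = 10_000


def min_physical_partitions(provisioned_ru: int) -> int:
    """Smallest number of physical partitions that keeps each one at or under the cap."""
    return math.ceil(provisioned_ru / MAX_RU_PER_PHYSICAL_PARTITION)


def ru_per_partition(provisioned_ru: int, partitions: int) -> float:
    """Even share of the provisioned throughput given to each physical partition."""
    return provisioned_ru / partitions


partitions = min_physical_partitions(55_000)
print(partitions)                                    # 6
print(round(ru_per_partition(55_000, partitions)))   # 9167
print(ru_per_partition(18_000, 3))                   # 6000.0, the example from the docs
```

So with an hourly key, all current writes are capped at roughly 9167 RU/s, no matter that 55000 RU/s are provisioned in total.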

  2. If we have 100 physical partitions, are the RUs shared among all 100 partitions equally (strictly), even when the other physical partitions are not serving any requests?

Yes, the RUs are shared among all 100 partitions equally (strictly), even when the other physical partitions are not serving any requests.

Found this MS doc which talks about the same.