0
votes

I'm running a simple ADF pipeline that copies data from Data Lake to Cosmos DB (SQL API).

After setting the database throughput to Autopilot 4,000 RU/s, the run took ~11 min and I see 207 throttled requests. After setting the database throughput to Autopilot 20,000 RU/s, the run took ~7 min and I see 744 throttled requests. Why is that? Thank you!


1
Autopilot provides, at most, the amount listed, so 4k and 20k in your case. If you're getting throttled, it means you still ended up using more RU/s than you had provisioned at that moment. Throttling isn't necessarily a bad thing unless you're seeing failed requests or want more throughput. If you do see failures or need more throughput, you need to scale up further. You might also evaluate your indexes: reducing them makes your writes cheaper and thus saves you RUs. - Chris Anderson-MSFT
By evaluating indexes, do you mean the partition key? Please help me understand. - user989988
I don't mean the partition key (though that can also be a factor in throttling if you end up with an overloaded partition key). I mean indexes, which, by default, are set to index everything (i.e. the path /*). This means your queries will be cheap, but writes will be expensive. You can modify your index policy in your container settings in the portal (or via an SDK). docs.microsoft.com/en-us/azure/cosmos-db/index-policy - Chris Anderson-MSFT

1 Answer

1
votes

Change the container's indexing mode from Consistent to None before running the ADF copy activity, then change it back to Consistent once the copy has finished.

Azure Cosmos DB supports two indexing modes:

  • Consistent: The index is updated synchronously as you create, update or delete items. This means that the consistency of your read queries will be the consistency configured for the account.
  • None: Indexing is disabled on the container. This is commonly used when a container is used as a pure key-value store without the need for secondary indexes. It can also be used to improve the performance of bulk operations. After the bulk operations are complete, the index mode can be set to Consistent and then monitored using the IndexTransformationProgress until complete.

How to modify the indexing policy:

Modifying the indexing policy
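If you would rather script the switch than use the portal, here is a minimal sketch of the None -> bulk load -> Consistent sequence described above. It assumes `database` is a `DatabaseProxy` from the `azure-cosmos` Python SDK (whose `replace_container` accepts an `indexing_policy` dict). The helper names, the `bulk_load` callable (for example, something that triggers your ADF pipeline and waits for it) and the restore policy are illustrative assumptions, not part of any official API.

```python
import copy

# Indexing disabled for the duration of the bulk load.
# Assumption: "automatic" is set to False together with mode "none".
BULK_LOAD_POLICY = {"indexingMode": "none", "automatic": False}

# Assumption: the container previously used the default
# "index everything" policy; substitute your own policy if it differs.
DEFAULT_POLICY = {
    "indexingMode": "consistent",
    "automatic": True,
    "includedPaths": [{"path": "/*"}],
    "excludedPaths": [{"path": '/"_etag"/?'}],
}


def set_indexing_policy(database, container_id, partition_key, policy):
    """Replace the container definition with the given indexing policy.

    `database` is expected to behave like azure.cosmos.DatabaseProxy and
    `partition_key` like azure.cosmos.PartitionKey (hypothetical helper).
    """
    return database.replace_container(
        container_id,
        partition_key=partition_key,
        indexing_policy=copy.deepcopy(policy),
    )


def run_with_indexing_disabled(database, container_id, partition_key,
                               bulk_load, restore_policy=DEFAULT_POLICY):
    """Disable indexing, run bulk_load(), then restore the policy."""
    set_indexing_policy(database, container_id, partition_key,
                        BULK_LOAD_POLICY)
    try:
        return bulk_load()
    finally:
        # Restore even if the load fails, so queries are not left
        # running against an unindexed container.
        set_indexing_policy(database, container_id, partition_key,
                            restore_policy)
```

After the policy is restored, the index is rebuilt in the background, so queries may be slower or more expensive until the transformation completes. If the container has other non-default settings (a default TTL, for instance), check whether they need to be passed to `replace_container` as well.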