2
votes

Although I am provisioning high RUs, I am not getting the required results.

The background: I am working on an IoT application and, unfortunately, the partition key is a poor choice: {deviceID}+ {dd/mm/yyyy hh:mm:sec:}. Technically speaking, each logical partition holds very few items (and will never reach the 10 GB limit), but I suspect that a huge number of physical partitions have been created, which forces my RUs to be split across them. How do I get the list of physical partitions?

3
Your partition key seems fine from a write point of view. It has no bearing on the number of physical partitions created; that depends on the size of the data in them and on when they need to split. But I am not sure that it will support your read queries. Searching either for all events from a specific device or for all recent events in general would require a cross-partition fan-out. You should probably ask a different question about the best partitioning strategy for your case - Martin Smith

3 Answers

0
votes

You can't control partitions, nor can you get a partition list, but you don't actually need them. It's not as if each partition is placed on a separate box. If you are suffering from low performance, you need to identify what is causing the throttling. You can use the Metrics blade to identify throttled partitions and figure out why they are throttled. You can also use diagnostic settings and stream the logs to Log Analytics to gain additional insights.

0
votes

We can get the list of partition key ranges using this API. The partition key ranges might change in the future as the data changes.

Physical partitions are an internal implementation detail. We don't have any control over the size or number of physical partitions, and we can't control the mapping between logical and physical partitions.

But we can control the distribution of data over logical partitions by choosing an appropriate partition key that spreads data evenly across multiple logical partitions.
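As a rough illustration of how the key choice shapes logical partitions, the sketch below counts how many distinct logical partition key values a handful of readings would produce under two candidate keys. The field names (`deviceId`, `timestamp`), the sample readings, and the helper `itemsPerLogicalPartition` are all made up for the example; this is not a recommendation for a specific key.

```javascript
// Hypothetical readings; field names and values are assumptions for illustration.
const readings = [
  { deviceId: "dev-1", timestamp: "2020-01-01T10:00:00Z" },
  { deviceId: "dev-1", timestamp: "2020-01-01T10:00:01Z" },
  { deviceId: "dev-2", timestamp: "2020-01-01T10:00:00Z" },
  { deviceId: "dev-2", timestamp: "2020-01-02T09:30:00Z" },
];

// Count how many items fall under each logical partition key value.
function itemsPerLogicalPartition(items, keyOf) {
  const counts = new Map();
  for (const item of items) {
    const key = keyOf(item);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

// Candidate 1: device plus full timestamp (one logical partition per reading here).
const byDeviceAndSecond = itemsPerLogicalPartition(
  readings,
  (r) => `${r.deviceId}_${r.timestamp}`
);

// Candidate 2: device plus calendar day (readings from the same day share a partition).
const byDeviceAndDay = itemsPerLogicalPartition(
  readings,
  (r) => `${r.deviceId}_${r.timestamp.slice(0, 10)}`
);

console.log(byDeviceAndSecond.size); // 4
console.log(byDeviceAndDay.size);    // 3
```

Running the same count over a sample of real data is a cheap way to compare candidate keys before committing to one.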

0
votes

This information used to be displayed straightforwardly in the portal, but it was removed in a redesign.

I feel that this is a mistake: provisioning RUs requires knowing the peak RU per partition multiplied by the number of partitions, so this number should be easily accessible.
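A rough sketch of that arithmetic, assuming provisioned throughput is split evenly across physical partitions. The function names and all the numbers are invented for illustration.

```javascript
// Sketch only: assumes provisioned RU/s are divided evenly across physical partitions.
// All numbers below are made up for illustration.

// RU/s that each physical partition receives from a container-level allocation.
function ruPerPartition(provisionedRu, partitionCount) {
  return provisionedRu / partitionCount;
}

// RU/s to provision so that the busiest partition still gets its peak requirement.
function requiredRu(peakRuPerPartition, partitionCount) {
  return peakRuPerPartition * partitionCount;
}

console.log(ruPerPartition(20000, 10)); // 2000
console.log(requiredRu(3500, 10));      // 35000
```

So with 10 partitions, a container provisioned at 20,000 RU/s gives each partition only 2,000 RU/s, and a hot partition that peaks at 3,500 RU/s would need 35,000 RU/s provisioned overall.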

The information is included in the JSON returned to the portal, but it is not shown to us. For collections provisioned with dedicated throughput (i.e. not using database-provisioned throughput), this JavaScript bookmark shows the information.

javascript:(function () {
    var ss = ko.contextFor($(".ext-quickstart-tabs-left-margin")[0]).$rawData().selectedSection();
    var coll = ss.selectedCollectionId();
    if (coll === null) {
        alert("Please drill down into a specific container");
    }
    else {
        alert("Partition count for container " + coll + " is " + ss.selectedCollectionPartitionCount());
    }
})();

Visit the Metrics tab in the portal, select the database and container, and then run the bookmark to see the count in an alert box.

You can also see this information from the pkranges REST endpoint, which the SDK itself uses. Some code that works in the V2 SDK is below:

using System;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

var documentClient = new DocumentClient(new Uri(endpointUrl), authorizationKey,
    new ConnectionPolicy { ConnectionMode = ConnectionMode.Direct });

var partitionKeyRangesUri = UriFactory.CreatePartitionKeyRangesUri(dbName, collectionName);
FeedResponse<PartitionKeyRange> response = null;

do
{
    // Pass the previous page's continuation token so each call fetches the next page
    // instead of re-reading the first one.
    response = await documentClient.ReadPartitionKeyRangeFeedAsync(partitionKeyRangesUri,
        new FeedOptions
        {
            MaxItemCount = 1000,
            RequestContinuation = response?.ResponseContinuation
        });

    foreach (var pkRange in response)
    {
        //TODO: Something with the pkRange
    }
} while (!string.IsNullOrEmpty(response.ResponseContinuation));
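
If you call the pkranges endpoint yourself rather than going through the SDK, counting the ranges in the response could look roughly like the sketch below. It assumes the response body carries a `PartitionKeyRanges` array whose entries have `id`, `minInclusive` and `maxExclusive`; the sample body and the helper `summarizeRanges` are invented for illustration, and no request is actually made.

```javascript
// Assumed shape of a parsed pkranges response body; the values are invented.
const sampleBody = {
  PartitionKeyRanges: [
    { id: "0", minInclusive: "", maxExclusive: "05C1DFFFFFFFFC" },
    { id: "1", minInclusive: "05C1DFFFFFFFFC", maxExclusive: "FF" },
  ],
  _count: 2,
};

// Summarize the ranges found in a parsed response body.
// A missing array is treated as zero ranges.
function summarizeRanges(body) {
  const ranges = body.PartitionKeyRanges || [];
  return {
    count: ranges.length,
    ids: ranges.map((r) => r.id),
  };
}

const summary = summarizeRanges(sampleBody);
console.log(summary.count); // 2
console.log(summary.ids);   // [ '0', '1' ]
```

Treat the count as a point-in-time value, since ranges can split as the data grows.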