
This question concerns the Performance Target thresholds in Azure Table Storage that are described here. In particular, I'm concerned about the limits of 2,000 entities per second per partition and 20,000 entities per second per storage account. My understanding is that if you go over these thresholds, Azure may begin to return 500 and 503 errors.

I think I may be running into this, and I'm looking for clarification about exactly how these items are counted. I understand that if you perform scans of either the full table or a partition, you will potentially be "charged" for each entity that Azure scans, regardless of the number you return. Since you could burn your entire query "budget" on a single poorly designed query of a 2000-row table, I want to make sure I correctly understand this.

For the sake of clarity, here's a simple table and some sample data that we can use in the next few scenarios.

PartitionKey | RowKey | Name
----------------------------
A              1        Alice
A              2        Bob
A              3        Candice
A              4        Dave
A              5        Eugenia
B              6        Frank
B              7        Genevieve
B              8        Henry
C              9        Ike
C              10       Jennifer

For each of the following scenarios, I'll call the number of entities that are "charged" against the Performance Target "budget" the Performance Charge. Let's start from the most egregious offense and go from there.

No PK, no RK specified
Query: Name == "Frank"
Performance Charge: 10 entities

I believe this is correct (please do correct me if I'm wrong). But what about the following scenarios?

PK specified, no RK specified
Query: PartitionKey == "A" AND Name == "Frank"
Performance Charge: 5 entities

PK specified, RK range specified
Query: PartitionKey == "A" AND (RowKey >= "2" AND RowKey <= "3")
Performance Charge: 2? or 5?

PK specified, RK partial range specified
Query: PartitionKey == "A" AND RowKey >= "4"
Performance Charge: 2? or 5?

PK range specified
Query: PartitionKey >= "B" AND PartitionKey <= "C"
Performance Charge: 5? or 10?

Partial PK range specified
Query: PartitionKey >= "C"
Performance Charge: 2? or 10?
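
To make the lower bound in each scenario concrete, here is a minimal Python sketch that evaluates the six filters against the sample data in memory and counts the entities each one would return. It does not call Azure and says nothing about how many entities the service actually scans; `ENTITIES`, `SCENARIOS`, and `returned_count` are names made up for this illustration. PartitionKey and RowKey are compared as strings, as they are in Table Storage.

```python
# Sample entities from the table above as (PartitionKey, RowKey, Name).
# Keys are strings, so comparisons are lexical ("10" sorts before "2").
ENTITIES = [
    ("A", "1", "Alice"), ("A", "2", "Bob"), ("A", "3", "Candice"),
    ("A", "4", "Dave"), ("A", "5", "Eugenia"),
    ("B", "6", "Frank"), ("B", "7", "Genevieve"), ("B", "8", "Henry"),
    ("C", "9", "Ike"), ("C", "10", "Jennifer"),
]

# Each scenario written as a plain Python predicate (illustrative only).
SCENARIOS = {
    "No PK, no RK": lambda pk, rk, name: name == "Frank",
    "PK, no RK": lambda pk, rk, name: pk == "A" and name == "Frank",
    "PK, RK range": lambda pk, rk, name: pk == "A" and "2" <= rk <= "3",
    "PK, partial RK range": lambda pk, rk, name: pk == "A" and rk >= "4",
    "PK range": lambda pk, rk, name: "B" <= pk <= "C",
    "Partial PK range": lambda pk, rk, name: pk >= "C",
}


def returned_count(predicate):
    """Count the sample entities that satisfy the predicate."""
    return sum(1 for entity in ENTITIES if predicate(*entity))


RETURNED = {label: returned_count(pred) for label, pred in SCENARIOS.items()}

for label, count in RETURNED.items():
    print(f"{label}: {count} returned")
```

Note that the second scenario returns nothing, because Frank lives in partition B, yet it is still the one I'd expect to be charged 5 entities.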

Thank you in advance for any pointers!

1 Answer


Based on my experience, here is how I understand each of your queries.

No PK, no RK specified
Query: Name == "Frank"

Full table scan: Query results in scanning the entire table (i.e. all rows in all partitions in a table).

Performance Charge: 10

PK specified, no RK specified
Query: PartitionKey == "A" AND Name == "Frank"

PK specified, RK range specified
Query: PartitionKey == "A" AND (RowKey >= "2" AND RowKey <= "3")

PK specified, RK partial range specified
Query: PartitionKey == "A" AND RowKey >= "4"

Row range scan: Query results in scanning the rows within the specific partition.

Performance Charge: 5

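PK range specified
Query: PartitionKey >= "B" AND PartitionKey <= "C"

Partial PK range specified
Query: PartitionKey >= "C"

Partition range scan: Query results in scanning the rows in a range of partitions.

Performance Charge: 5 for the first query (partitions B and C), 2 for the second (partition C only)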
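
The accounting above can be summarized as "every entity in every partition the PartitionKey condition selects, or the whole table if there is none." Here is a minimal Python sketch of that rule applied to the sample data. It only models my reading of the scan types; it is not Azure's documented metering, and `PARTITION_SIZES` and `scan_charge` are hypothetical names for this example.

```python
# Number of sample entities in each partition of the question's table.
PARTITION_SIZES = {"A": 5, "B": 3, "C": 2}


def scan_charge(pk_predicate=None):
    """Entities scanned under the assumed model.

    Every entity in each partition selected by the PartitionKey predicate
    is counted; with no predicate, the whole table is counted. RowKey and
    property filters are ignored on purpose (an assumption of this model).
    """
    return sum(
        size
        for pk, size in PARTITION_SIZES.items()
        if pk_predicate is None or pk_predicate(pk)
    )


CHARGES = {
    "full table scan": scan_charge(),
    "single partition A": scan_charge(lambda pk: pk == "A"),
    "partitions B to C": scan_charge(lambda pk: "B" <= pk <= "C"),
    "partitions from C": scan_charge(lambda pk: pk >= "C"),
}

for label, charge in CHARGES.items():
    print(f"{label}: {charge} entities")
```

Under this model the three partition A queries all cost the same, because the RowKey and Name conditions do not change which partition is scanned.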

For more background, see Designing a Scalable Partitioning Strategy for Azure Table Storage and this tutorial.