3
votes

I'm using Azure Table Storage from my PHP application and seeing very slow response times. The application gathers reports each day for each user and places them in a partition using the partition key $userID . date("dmY"). During load testing one particular day had several thousand reports generated which requires multiple round trips to fetch due to the 1,000 entity limit in an Azure Table Storage Query. Each trip to fetch 1,000 entities can take up to 2 seconds.


Code:

for($i = 0; $i < $daysToGoBack; $i++)
{
    $filter = "PartitionKey eq '" . $userId . date("dmY", strtotime("-$i days")) . "'";

    $options = new QueryEntitiesOptions();
    $options->addSelectField('created');
    $options->setFilter(Filter::applyQueryString($filter));

    $this->benchmark->mark('main_query');

    $result = $this->tableRestProxy->queryEntities('reports', $filter, $options);

    $this->benchmark->mark('main_query_end');
    echo "Query: " . $this->benchmark->elapsed_time('main_query', 'main_query_end') . "<br/>";

    $entities = array_merge($result->getEntities(), $entities);

    $nextPartitionKey = $result->getNextPartitionKey();
    $nextRowKey = $result->getNextRowKey();

    while(!is_null($nextRowKey) && !is_null($nextPartitionKey)) 
    {
        $options = new QueryEntitiesOptions();
        $options->setNextPartitionKey($nextPartitionKey);
        $options->setNextRowKey($nextRowKey);
        $options->addSelectField('created');
        $options->setFilter(Filter::applyQueryString($filter));

        $this->benchmark->mark('sub_query');

        $newResult = $this->tableRestProxy->queryEntities('reports', $options);

        $this->benchmark->mark('sub_query_end');
        echo "Continuation: " . $this->benchmark->elapsed_time('sub_query', 'sub_query_end') . "<br/>";

        $newEntities = $newResult->getEntities();

        $entities = array_merge($newEntities, $entities);

        $nextPartitionKey = $newResult->getNextPartitionKey();
        $nextRowKey = $newResult->getNextRowKey();
    }


Result:

Query: 1.8183
Continuation: 1.2479
Continuation: 0.2423
Continuation: 0.2619
Continuation: 0.2476
Continuation: 0.2836
Continuation: 0.2345
Continuation: 0.2482
Continuation: 0.2565
Continuation: 0.2187
Continuation: 0.2319
Continuation: 0.2389
Continuation: 0.2221
Query: 0.0320
Query: 0.0338
Query: 0.1038
Query: 0.1263
Query: 0.1841
Query: 0.0547


The results above are about the best I could get out of it. The first query has almost 13,000 reports to pull back so does the initial query and then 12 calls back with the continuation token. Most of the time the queries can take much longer on exactly the same data.

Query: 1.8273
Continuation: 1.2592
Continuation: 0.8160
Continuation: 0.8463
Continuation: 0.7474
Continuation: 0.7104
Continuation: 1.3987
Continuation: 1.4321
Continuation: 1.4526
Continuation: 1.3184
Continuation: 0.7390
Continuation: 0.7212
Continuation: 0.2610
Query: 0.0630
Query: 0.1221
Query: 0.0728
Query: 0.1250
Query: 0.1717
Query: 0.0568

Are these results expected from Azure or is there a more efficient way to query this data out?

1
How much raw data (size in MB) is coming back with these queries? Also, I'm not a PHP guy so maybe your code clearly states it but does your filter contain anything other than the partition key? I see you calling a function to enhance it. If you indeed filter on non partition key fields, that could be your problem.Jaxidian
The query filter is simply the partition key, I want the entire contents of the partition. I limit the fields of the entities being returned to just the 'created' field to limit the size of the return, it's just a time stamp "1430229469". This makes it about as small as is possible, roughly 10Kb for the data I'm asking for per query/continuation.Scott Helme
Are you running in Azure or locally?Matthew Steeples
I was running it on DigitalOcean servers which were on the US West Coast, the same as my Azure region. That said, I did fire up an Azure Web App and it only shaved off about 5% being in the same data centre, which I imagine was the latency of going external.Scott Helme

1 Answers

1
votes