20
votes

I have a database in SQL Azure which is taking between 15 and 30 minutes to do a simple:

select count(id) from mytable

The database is about 3.3GB and the count returns approximately 2,000,000, but when I run the same query locally it takes less than 5 seconds!

I have also run a:

ALTER INDEX ALL ON mytable REBUILD

On all the tables in the database.

Would appreciate if anybody could point me to some things to try to diagnose/fix this.

(Please skip to UPDATE 3 below as I now think this is the issue but I still do not understand it).

UPDATE 1: It appears to spend 99% of the time in a clustered index scan, as the image below shows.

[image]

UPDATE 2: And this is what the statistics messages come back as when I do:

SET STATISTICS IO ON
SET STATISTICS TIME ON
select count(id) from TABLE

Statistics:

SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 317037 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

(1 row(s) affected)
Table 'TABLE'. Scan count 1, logical reads 279492, physical reads 8220, read-ahead reads 256018, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

 SQL Server Execution Times:
   CPU time = 297 ms,  elapsed time = 438004 ms.
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

UPDATE 3: OK - I have another theory now. The Azure portal suggests that each time I run this simple select query, it maxes out my DTU percentage at nearly 100%. I am using a Standard Azure SQL instance with performance level S1 (20 DTUs). Is it possible that this simple query is being slowed down by my DTU limit?
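
One way to check whether the DTU cap is being hit on data IO rather than CPU is to look at the per-resource breakdown right after running the count. This is a minimal sketch, assuming it is run inside the affected Azure SQL database; sys.dm_db_resource_stats reports recent usage as a percentage of the limits of the current performance level.

```sql
-- Most recent resource usage samples for this database, newest first.
-- Each value is a percentage of the limit for the current performance level.
SELECT TOP (20)
       end_time,
       avg_cpu_percent,
       avg_data_io_percent,
       avg_log_write_percent
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;
```

If avg_data_io_percent sits near 100 while avg_cpu_percent stays low, the scan is probably being throttled on IO. That would be consistent with the statistics in UPDATE 2, where CPU time is 297 ms against 438004 ms elapsed.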

3
Have you checked to make sure there is no deadlock? - Craig
Not formally but I am pretty sure there is not - I have turned off any updates so my query is the only thing which should be hitting the DB. - chrisb
Q: So did you ever resolve your Azure performance problem? - FoggyDay
@FoggyDay Not yet, I think I have narrowed it down to my understanding of DTUs but need to find time to investigate further. I can't understand how a simple count(id) can max out my quota so I must be missing something. - chrisb
The main answer is that DTU is a terrible metric because some queries are IO bound and others CPU bound, and DTU is some black box "blend" of both. Upgrade to a higher performance tier, but even then you may be throttled... Seems to be a design choice by MS - it's about small transactions, not aggregation or analytics. - N West

3 Answers

10
votes

I realize this is old, but I had the same issue. I had a table with 2.5 million rows that I imported from an on-prem database into Azure SQL and ran at S3 level. Select Count(0) from Table resulted in a 5-7 minute execution time vs milliseconds on-premise.

In Azure, index and table scans seem to be penalized tremendously in performance, so adding a 'useless' WHERE to the query that forces it to perform an index seek on the clustered index helped.

In my case, the almost identical query Select count(0) from Table where id > 0 performed on par with the on-premise query.
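
To check whether the extra predicate changes anything on a given database, the two forms can be measured side by side. This is only a sketch: mytable and id are placeholder names borrowed from the question, and the speed-up described above is one reported observation, not a guaranteed result.

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Plain count; the question's plan showed a clustered index scan for this form.
SELECT COUNT(0) FROM mytable;

-- Same count with a range predicate on the clustered key.
-- Assumes every id is positive; otherwise the two results differ.
SELECT COUNT(0) FROM mytable WHERE id > 0;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Compare the logical reads, physical reads and elapsed time in the Messages output. It is worth running the statements in both orders, since whichever runs second may benefit from pages the first one already pulled into the buffer pool.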

3
votes

Suggestion: try select count(*) instead; it might actually improve the response time.

Also, have you done an "explain plan"?

============ UPDATE ============

Thank you for getting the statistics.

You're doing a full table scan of 2M rows - not good :(

POSSIBLE WORKAROUND: read row_count from the sys.dm_db_partition_stats system view instead:

http://blogs.msdn.com/b/arunrakwal/archive/2012/04/09/sql-azure-list-of-tables-with-record-count.aspx

select t.name ,s.row_count from sys.tables t
join sys.dm_db_partition_stats s
ON t.object_id = s.object_id
  and t.type_desc = 'USER_TABLE'
  and t.name not like '%dss%'
  and s.index_id = 1

2
votes

Quick refinement of @FoggyDay's post. If your tables are partitioned, you'll want to sum the row count across partitions.

SELECT t.name, SUM(s.row_count) row_count
FROM sys.tables t
JOIN sys.dm_db_partition_stats s
ON t.object_id = s.object_id
  AND t.type_desc = 'USER_TABLE'
  AND t.name not like '%dss%'
  AND s.index_id = 1
GROUP BY t.name
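
For the single table in the question, the same view can be filtered by object id instead of listing every table. This is a sketch assuming the table is dbo.mytable (a placeholder name). Note that row_count is documented as approximate, so it may not exactly match COUNT(*).

```sql
-- Row count for one table, summed across partitions.
SELECT SUM(s.row_count) AS row_count
FROM sys.dm_db_partition_stats AS s
WHERE s.object_id = OBJECT_ID(N'dbo.mytable')
  AND s.index_id IN (0, 1);  -- 0 = heap, 1 = clustered index
```

Using index_id IN (0, 1) rather than index_id = 1 also covers tables without a clustered index, which the queries above would leave out.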