0
votes

i'm having some problems with the reliability of Azure SQL servers.

sometimes doing complicated queries with subqueries like the following:

SELECT DISTINCT [DeviceName] ,name ,data.[Addr] ,[Signal] FROM (SELECT [DeviceName] ,[Signal] ,MAX([Signal]) OVER (PARTITION BY [Addr]) AS 'MaxSignal',[Timestamp] ,[Addr] ,[PartitionId] ,[EventEnqueuedUtcTime] FROM [dbo].[mytable] WHERE CAST([Timestamp] AS DATETIME) > DATEADD(HOUR,+2,(DATEADD(MINUTE, -10, GETDATE()))) ) data LEFT JOIN mytable ON [dbo].[myreftable].[Addr] = data.[Addr] WHERE [Signal] = [MaxSignal];

Is done in almost an instant, like i would assume, at other times simply doing a SELECT COUNT(*) FROM mytable Is taking upwards of 30 minutes, and showing a DTU usage graph like this:

DTU usage graph

Anyone know any solutions to this? is it me doing something completely wrong? or is Azure simply not there yet?

1

1 Answers

2
votes

what you pay is what you get.You will need to look at what are the top resources consumers in your system.DTU is nothing, but a limit on CPU,IO,Memory available to your database..

so to troubleshoot DTU problems,I would follow below steps..

1.)Below query gives me Resource usage for last 14 days for all resources..

SELECT 
    (COUNT(end_time) - SUM(CASE WHEN avg_cpu_percent > 80 THEN 1 ELSE 0 END) * 1.0) / COUNT(end_time) AS 'CPU Fit Percent'
    ,(COUNT(end_time) - SUM(CASE WHEN avg_log_write_percent > 80 THEN 1 ELSE 0 END) * 1.0) / COUNT(end_time) AS 'Log Write Fit Percent'
    ,(COUNT(end_time) - SUM(CASE WHEN avg_data_io_percent > 80 THEN 1 ELSE 0 END) * 1.0) / COUNT(end_time) AS 'Physical Data Read Fit Percent'
FROM sys.dm_db_resource_stats

running above query gives you an idea on how much is your CPU percentage consistently

2.) Below query gives me an idea of resource usage over time..

SELECT start_time, end_time,   
  (SELECT Max(v)    FROM (VALUES (avg_cpu_percent), (avg_physical_data_read_percent), (avg_log_write_percent)) AS value(v)) as [avg_DTU_percent] 
FROM sys.resource_stats where database_name = ‘<your db name>’ order by end_time desc

Now that ,i have enough data to look out which metric is more resource intensive,I can follow normal approach of trying to troubleshoot..

Say for Example,if my CPU usage is above 90% consistently over time,I will gather all the queries which are consuming more CPU and try to fine tune them