Azure SQL Timeout after query inactivity

Question

I'm experiencing frequent timeouts on my managed Azure SQL Instance. When no query activity have occurred for some time, the first query, which is expected to return between 500-2000 rows, will use all the available DTUs in my pricing tier (S2 50 DTU plan) and always result in the following exception:

System.Data.SqlClient.SqlException (0x80131904): Timeout expired.  The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception (258): The wait operation timed out
   at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at System.Data.SqlClient.SqlDataReader.TryConsumeMetaData()
   at System.Data.SqlClient.SqlDataReader.get_MetaData()
   at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString)
   at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, SqlDataReader ds)
   at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean asyncWrite, String method)
   at System.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior)
   at System.Data.SqlClient.SqlCommand.ExecuteDbDataReader(CommandBehavior behavior)
   at Microsoft.EntityFrameworkCore.Storage.Internal.RelationalCommand.Execute(IRelationalConnection connection, DbCommandMethod executeMethod, IReadOnlyDictionary`2 parameterValues)
ClientConnectionId:903324e4-4eba-4522-bae8-228a23c0e51c
Error Number:-2,State:0,Class:11
ClientConnectionId before routing:24ca7fb2-4c3b-44ad-b393-d8cda9dda172

When executing the same query right after, the server respons in no time. Probably because the recent query has been loaded in-memory. The SQL Server is used in a IoT setup, so data streaming (Bulk inserts) from devices occurs constantly through the day. I tried solving the problem by letting an Azure Function performing a query each hour to load frequently accessed data into the memory, but this will only solve the problem for specific Entity I'm querying in the Azure Function. According to the execution plan, all my queries are using the correct indexes.

I don't think implementing a Retry Policy is a acceptable solution. From my perspective, this will only decrease the User Experience even more. Does anyone have any experience with this issue?

EDIT: Excecution plan here: https://www.brentozar.com/pastetheplan/?id=rJQvb7tRm

Perhaps query/index tuning is the solution. Upload your actual execution plan and add the link to your question. — Dan Guzman
Please try increasing the timeout value and/or using a larger reservation size. Compiling queries can take time to get plans into the procedure cache. During that process, statistics may also get updated (which are sampled but it can take incrementally more time). Once you figure out your threshold, you can try things like async stats update to tune your app to your desired cost level. — Conor Cunningham MSFT
@DanGuzman and Conor thanks for replying. I've included the execution plan in the post. — Mortenkp25
@Mortenkp25, glad it worked for you. I added the comment as an answer to better help others too. — Dan Guzman

Dan Guzman Dan Guzman · Accepted Answer · 2018-11-26T22:32:11

The actual exeuction plans shows 1187 rows were read by the clustered index seek on the DeviceUniqueIdentifier clustered index key even though no rows satisfied the other criteria. Adding InstanceId and NVEControllerTimestamp to the clustered index key will avoid touching those rows needlessly.

Generally, the best index to support a trivial query that returns all columns is a clustered index with the equality predicate key columns first (DeviceUniqueIdentifier and InstanceId) followed by the inequality columns (range search on NVEControllerTimestamp).

You can use CREATE INDEX...WITH (DROP_EXISTING = ON) to add the additional columns to the existing clustered index.

Azure SQL Timeout after query inactivity

1 Answers