NOTE: I give details of my Azure setup here, but I'm not sure the solution will be an Azure-based one. This may be a problem that can be resolved at the C#, Entity Framework, or SQL level.
I have a .NET web application running on Azure App Service using Entity Framework to access an Azure SQL DB at pricing level Standard S1 (20 DTU). 99% of the time, the app is utilizing less than 1% of DTU on the SQL DB. However, when someone logs into the Admin Portal of the app and runs a particular report, it executes a query that is very resource intensive and takes a very long time - over a minute - which we can't live with. This report is run only a few times a week. I've tried scaling the SQL DB up and have found - unsurprisingly - that at higher plans, the execution time gets to a somewhat reasonable level. At Standard S4 (200 DTU), the execution time drops to 20 seconds, which is not ideal but I can live with for now. However, it doesn't make sense to pay for S4-level when 99% of the time it will be using only a fraction of a percent of DTU. Any ideas on how I can either reduce the query execution time or only scale when needed?
The Entity Framework code used for this report is:
class MyAppModelContainer : DbContext
{
public virtual ObjectResult<GetOrganizationList_Result> GetOrganizationList()
{
return ((IObjectContextAdapter)this).ObjectContext.ExecuteFunction<GetOrganizationList_Result>("GetOrganizationList");
}
}
The model used to retrieve the results is:
public partial class GetOrganizationList_Result
{
public int id { get; set; }
public string Name { get; set; }
public Nullable<int> DeviceCounts { get; set; }
public Nullable<int> EmailCounts { get; set; }
}
The stored procedure is:
CREATE PROCEDURE [dbo].[GetOrganizationList]
AS
BEGIN
SELECT o.Id,o.Name,COUNT(distinct s.DeviceId) as DeviceCounts, COUNT(distinct d.userid) as EmailCounts
FROM Sessions s
INNER JOIN Devices d on d.Id = s.DeviceId
RIGHT OUTER JOIN Organizations o on o.id=s.OrganizationId
GROUP BY o.Id,Name
END
The approximate number of rows in each of the joined tables: Sessions table: 2 million rows Devices table: 166,000 rows Users table: 88,000 rows
Here are the table definitions and indexes:
CREATE TABLE [dbo].[Sessions] (
[Id] INT IDENTITY (1, 1) NOT NULL,
[DeviceId] INT NULL,
[StartTime] DATETIME NOT NULL,
[OrganizationId] INT NOT NULL,
CONSTRAINT [PK_Sessions] PRIMARY KEY CLUSTERED ([Id] ASC),
CONSTRAINT [FK_DeviceSession] FOREIGN KEY ([DeviceId]) REFERENCES [dbo].[Devices] ([Id]),
CONSTRAINT [FK_OrganizationSession] FOREIGN KEY ([OrganizationId]) REFERENCES [dbo].[Organizations] ([Id])
);
CREATE NONCLUSTERED INDEX [IX_FK_DeviceSession]
ON [dbo].[Sessions]([DeviceId] ASC);
CREATE NONCLUSTERED INDEX [IX_FK_OrganizationSession]
ON [dbo].[Sessions]([OrganizationId] ASC);
CREATE NONCLUSTERED INDEX [IX_Sessions_OrganizationId_Include_DeviceId]
ON [dbo].[Sessions]([OrganizationId] ASC)
INCLUDE([DeviceId]);
CREATE NONCLUSTERED INDEX [IX_Sessions_OrganizationId_DeviceId] ON [dbo].[Sessions]
(
[DeviceId] ASC,
[OrganizationId] ASC,
[StartTime] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
CREATE TABLE [dbo].[Devices] (
[Id] INT IDENTITY (1, 1) NOT NULL,
[UserId] INT NULL,
[MACAddress] NCHAR (12) NOT NULL,
CONSTRAINT [PK_Devices] PRIMARY KEY CLUSTERED ([Id] ASC),
CONSTRAINT [FK_UserDevice] FOREIGN KEY ([UserId]) REFERENCES [dbo].[Users] ([Id]),
CONSTRAINT [IX_Unique_MACAddress] UNIQUE NONCLUSTERED ([MACAddress] ASC)
);
CREATE NONCLUSTERED INDEX [IX_FK_UserDevice]
ON [dbo].[Devices]([UserId] ASC);
CREATE TABLE [dbo].[Users] (
[Id] INT IDENTITY (1, 1) NOT NULL,
[Email] NVARCHAR (250) NOT NULL,
[Sex] TINYINT NOT NULL,
[Age] SMALLINT NOT NULL,
[PhoneNumber] NCHAR (10) NOT NULL DEFAULT '' ,
[Name] NVARCHAR(100) NOT NULL DEFAULT '',
CONSTRAINT [PK_Users] PRIMARY KEY CLUSTERED ([Id] ASC),
CONSTRAINT [IX_Unique_Email_PhoneNumber] UNIQUE NONCLUSTERED ([Email] ASC, [PhoneNumber] ASC)
);
I rebuild indexes and update statistics on a weekly basis. Azure SQL DB has no performance recommendations.
Any ideas on how to solve this problem without simply throwing more Azure hardware at it? I'm open to anything including Azure-level changes, SQL changes, code changes. It doesn't appear that there's a consumption model of pricing for Azure SQL DB, which may help me if it existed.