0
votes

I have a large procedure that performs lots of updates and selects on a table with about 50 million records in and the query plan shows that there are missing indexes that would help:

Missing Index (Impact 89.4367): CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [dbo].[myTable] ([Category],[Quarter],[COLA]) INCLUDE ([UserId],[Price])

Missing Index (Impact 90.7279): CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [dbo].[myTable] ([Category],[Quarter],[COLB]) INCLUDE ([UserId],[Price])

Missing Index (Impact 90.4069): CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [dbo].[myTable] ([Category],[Quarter][COLC]) INCLUDE ([UserId],[Price])

Missing Index (Impact 90.6373): CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [dbo].[myTable] ([Category],[Quarter],[COLD]) INCLUDE ([UserId],[Price])

Missing Index (Impact 88.774): CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [dbo].[myTable] ([Category],[Quarter],[COLE]) INCLUDE ([UserId],[Price])

Missing Index (Impact 89.9133): CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [dbo].[myTable] ([Category],[Quarter],[COLF]) INCLUDE ([UserId],[Price])

Missing Index (Impact 90.1297): CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [dbo].[myTable] ([Category],[Quarter],[COLF]) INCLUDE ([UserId],[Price])

Missing Index (Impact 76.6554): CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [dbo].[myTable] ([Category],[Quarter],[COLG]) INCLUDE ([UserId],[Price])

Missing Index (Impact 90.9105): CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [dbo].[myTable] ([Category],[Quarter],[COLH]) INCLUDE ([UserId],[Price])

Missing Index (Impact 84.1814): CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [dbo].[myTable] ([Category],[Quarter],[COLI]) INCLUDE ([UserId],[Price])

Missing Index (Impact 89.3511): CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [dbo].[myTable] ([Category],[Quarter],[COLJ]) INCLUDE ([UserId],[Price])

Missing Index (Impact 90.3087): CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [dbo].[myTable] ([Category],[Quarter],[COLK]) INCLUDE ([UserId],[Price])

Missing Index (Impact 90.6367): CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [dbo].[myTable] ([Category],[Quarter],[COLL]) INCLUDE ([UserId],[Price])

Missing Index (Impact 75.6598): CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [dbo].[myTable] ([Category],[Quarter],[COLM]) INCLUDE ([UserId],[Price])

Missing Index (Impact 82.8915): CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [dbo].[myTable] ([Category],[Quarter],[COLN]) INCLUDE ([UserId],[Price])

Missing Index (Impact 88.2316): CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [dbo].[myTable] ([Category],[Quarter],[COLO]) INCLUDE ([UserId],[Price])

Missing Index (Impact 81.9138): CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [dbo].[myTable] ([Category],[Quarter],[COLP]) INCLUDE ([UserId],[Price])

Missing Index (Impact 80.1902): CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [dbo].[myTable] ([Category],[Quarter],[COLP]) INCLUDE ([UserId],[Price])

My question is should I:

  1. Create all of the indexes it suggests

  2. Create a covering index with all of the columns e.g.

CREATE NONCLUSTERED INDEX indexA 
ON [dbo].[myTable] ([Category], [Quarter], [COLA], [COLB], [COLC], [COLD], [COLE], [COLF], 
                    [COLG], [COLH], [COLI], [COLJ], [COLK], [COLL], [COLM], [COLN], [COLO], [COLP])  
INCLUDE ([UserId],[Price])  
  1. Create an index with all of the columns in the includes e.g
CREATE NONCLUSTERED INDEX indexB 
ON [dbo].[myTable] ([Category], [Quarter])   
INCLUDE ([COLA], [COLB], [COLC], [COLD], [COLE], [COLF], [COLG], [COLH], [COLI],
         [COLJ], [COLK], [COLL], [COLM], [COLN], [COLO], [COLP], [UserId], [Price])
  1. Some other combination?

I understand there may not be an exact answer to this but which would you recommend and why?

1
Without information about the actual queries you run, everything is supposition. People write entire books on this topic.MatBailie
I understand, but in reality there are around 250 inserts and a similar number of selects that look for various issues and I couldn't feasibly include them all here so a theoretical answer is all that's required.feijoc

1 Answers

0
votes

The main problem in adding indexes is that they slow down data update operations (inserts, deletes, updates) as the indexes also need to be updated.

While this answer is opinion-based, there is one main consideration imo - How often is the table updated (e.g., new inserts/updates/deletes)?

  • If there are no updates ever (or only manual) then make as many indexes as desired - this can be done for reference tables etc. But this is not a reference table imo.
  • Alternatively, if it's a data warehouse table with a data load then complete index rebuild daily, it is also viable
  • On the other hand, if it has a lot of updates e.g., transactions loaded one at a time, or in small batches, then more indexes will reduce the performance of those
    • For this reason, I am typically very averse to having many non-clustered indexes (and too-wide indexes) on transaction tables

Note with an index, you can only have a certain number of bytes in the main index (though the include fields can be longer). Therefore an index on all fields will probably not be viable. Furthermore, having all those fields as part of the index probably won't help - when it needs (say) ColB, the data is already sorted by ColA - therefore it will need to read the whole index again anyway.

Personally, I would suggest starting with an index on [Category],[Quarter],[COLA] then include all the other fields. You can replace ColA with any of the other columns if they're used more frequently. It provides

  • a good index for the ColA part, as it provides all the info needed (covering index) and is sorted correctly, and
  • a covering index for the other parts

It will still need to do index scans, but it's worth trying. See if that speeds up your query to an acceptable level.

Note, though, if you already have

  • The clustered index on [Category],[Quarter] and
  • Little data in other fields (e.g., no varchar(2000) fields in the table but not used by this

... then the non-clustered index may not help much.

In a similar vein, depending what else you may do with the table, if the clustered index is not on [Category],[Quarter] then you may want to make it that. Note though that if anything else uses this table, you could be forcing poor performance on that.