0
votes

I have a table MY_TABLE with approximately 9 million rows.

There are in total of 38 columns in this table. The columns that are relevant to my question are:

  • RECORD_ID: identity, bigint, with unique clustered index
  • RECORD_CREATED: datetime, with non-unique & non-clustered index

Now I run the following two queries and naturally expect the first one to execute faster because the data is being sorted by a column that has a unique clustered index but somehow it executes 271 times(!) slower.

SELECT TOP 1 
    RECORD_ID 
FROM 
    MY_TABLE 
WHERE 
    RECORD_CREATED >= '20140801' 
ORDER BY 
    RECORD_ID

SELECT TOP 1 
    RECORD_ID 
FROM 
    MY_TABLE 
WHERE 
    RECORD_CREATED >= '20140801' 
ORDER BY 
    RECORD_CREATED

The execution times are 1630 ms and 6 ms, respectively.

Please advise.

P.S.: Due to security policies of the environment I cannot see the execution plan or use SQL Profiler.

1
The second one is just a straight index seek. Read first row greater or equal to the seek point and possible single lookup if the index isnt covering. And in your case the index will be covering as the NCI always includes the CI key so no lookup needed.Martin Smith
@MartinSmith, I am really sorry, but I didn't understand your comment completely. Could you please explain a bit more? An answer is also welcome.anar khalilov
I won't bother submitting an answer as I'm not on a PC. In general RECORD_CREATED >= '20140801 can be satisfied by a range seek. Finding the relevant point in the index then scanning it along. Because you just want the top 1 in index order SQL Server can stop the range seek as soon as it has read the first row. Your select list is actually on a different column record_id but that is also silently included in the non clustered index as it is the clustered index key.Martin Smith

1 Answers

2
votes

SQL Server has a few choices to make about how to perform this query. It could begin by sorting all the items, leveraging the indexes you mentioned, and then follow that up by filtering out any items that don't match the WHERE clause. However, it's typically faster to cut down on the size of the data set that you're working with first, so you don't have to sort as many items.

So SQL Server is most-likely choosing to perform the WHERE filter first. When it does this, it most likely starts by using the non-unique, non-clustered index on RECORD_CREATED to skip over all the items where RECORD_CREATED is less than '20140801', and then take all the items after that.

At that point, all the items are pre-sorted in the order in which they were found in the RECORD_CREATED index, so the second query requires no additional effort, but the first query then has to perform a sort on the records that have been chosen.