Why isn't SQL Server using my clustered index and doing a non-clustered index scan?

Question

I have a patient table with a few columns, and a clustered index on column ID and a non-clustered index on column birth.

create clustered index CI_patient on dbo.patient (ID)
create nonclustered index NCI_patient on dbo.patient (birth)

Here are my queries:

select * from patient 

select ID from patient

select birth from patient

Looking at the execution plan, the first query is 'clustered index scan' (which is understandable because the table is a clustered table), the third one is 'index scan nonclustered' (which is also understandable because this column has a nonclustered index)

My question is why the second one is 'index scan nonclustered'? This column suppose to have a clustered index, in this sense, should that be clustered index scan? Any thoughts on this?

How large is the table? The query plans on small tables don't mean much. — Gordon Linoff
@GordonLinoff It is a pretty small table, or a test table. I created it to practice sql index. — Kay
A non-clustered index also contains the clustered index key as the row locater. This is ID - so it is more efficient to scan that index. Change the clustered index to something else and you will likely see that plan change to the clustered index. — SMor
@SMor, would you please further explain why if the column is ID it would more efficient to be non-clustered index scan? Thanks! — Kay
@Kay, the leaf nodes of the non-clustered index contain only birth(the key column) and id (row locator) as @SMor mentioned. The leaf nodes of the clustered index contain all columns in the table. Although both contain the requested id column, the optimizer chose the non-clustered index because is faster to scan the smaller of the 2 indexes. Full scans (e.g. no WHERE clause) typically use the narrowest index with all the columns needed by the query. — Dan Guzman

marc_s marc_s · Accepted Answer · 2020-04-25T05:31:31

Basically, your second query wants to get all ID values from the table (no WHERE clause or anything).

SQL Server can do this two ways:

clustered index scan - basically a full table scan to read all the data from all rows, and extract the ID from each row - would work, but it loads the WHOLE table, one by one
do a scan across the non-clustered index, because each non-clustered index also includes the clustering column(s) on its leaf level. Since this is a index that is much smaller than the full table, to do this, SQL Server will need to load fewer data pages and thus can provide the answer - all ID values from all rows - faster than when doing a full table scan (clustered index scan)

The cost-based optimizer in SQL Server just picks the more efficient route to get the answer to the question you've asked with your second query.

Why isn't SQL Server using my clustered index and doing a non-clustered index scan?

1 Answers