1
votes

I have a patient table with a few columns, and a clustered index on column ID and a non-clustered index on column birth.

create clustered index CI_patient on dbo.patient (ID)
create nonclustered index NCI_patient on dbo.patient (birth)

Here are my queries:

select * from patient 

select ID from patient

select birth from patient 

Looking at the execution plan, the first query is 'clustered index scan' (which is understandable because the table is a clustered table), the third one is 'index scan nonclustered' (which is also understandable because this column has a nonclustered index)

My question is why the second one is 'index scan nonclustered'? This column suppose to have a clustered index, in this sense, should that be clustered index scan? Any thoughts on this?

query

1
How large is the table? The query plans on small tables don't mean much. - Gordon Linoff
@GordonLinoff It is a pretty small table, or a test table. I created it to practice sql index. - Kay
A non-clustered index also contains the clustered index key as the row locater. This is ID - so it is more efficient to scan that index. Change the clustered index to something else and you will likely see that plan change to the clustered index. - SMor
@SMor, would you please further explain why if the column is ID it would more efficient to be non-clustered index scan? Thanks! - Kay
@Kay, the leaf nodes of the non-clustered index contain only birth(the key column) and id (row locator) as @SMor mentioned. The leaf nodes of the clustered index contain all columns in the table. Although both contain the requested id column, the optimizer chose the non-clustered index because is faster to scan the smaller of the 2 indexes. Full scans (e.g. no WHERE clause) typically use the narrowest index with all the columns needed by the query. - Dan Guzman

1 Answers

3
votes

Basically, your second query wants to get all ID values from the table (no WHERE clause or anything).

SQL Server can do this two ways:

  • clustered index scan - basically a full table scan to read all the data from all rows, and extract the ID from each row - would work, but it loads the WHOLE table, one by one
  • do a scan across the non-clustered index, because each non-clustered index also includes the clustering column(s) on its leaf level. Since this is a index that is much smaller than the full table, to do this, SQL Server will need to load fewer data pages and thus can provide the answer - all ID values from all rows - faster than when doing a full table scan (clustered index scan)

The cost-based optimizer in SQL Server just picks the more efficient route to get the answer to the question you've asked with your second query.