1
votes

Working in a datawarehouse environment (SQL Server 2008) and there are a few tables that have about 2 million rows and 20 columns. Each night, the tables are dropped and re-created. When they are made, indexes are also built. For some reason there is no clustered index on these tables. There are however, unique nonclustered indexes. Seems unlogical. Does anybody know any downsides to changing that all so that all these tables have a clustered index. Should save some space and perform better.

Any tips?

Thanks in advance.

1
AFAIK all tables in SQL Server have a clustered index, whether or not you declare one as such. - Tim Biegeleisen
@TimBiegeleisen: I don't think that is true. If you create the PK as "nonclustered", SQL Server will not create a hidden clustered index. Are you maybe confusing that with MySQL? - a_horse_with_no_name
@a_horse_with_no_name By default SQL Server will create a clustered index in the background (see here). Are you certain that this can be turned off, other than by creating another named index? - Tim Biegeleisen
@TimBiegeleisen create table foo (id integer primary key nonclustered) - a_horse_with_no_name
@a_horse_with_no_name OK but that still creates an index. I think I phrased my first comment wrongly. - Tim Biegeleisen

1 Answers

2
votes

Indeed clustered indexes have drawbacks too.

I think the most under-estimated drawback is what I refer to as clustered index penalty.

If you don't have any clustered index on a table, it means the table is stored as a heap table. All non-clustered indexes refer to that heap table.

The good thing about heap tables is that rows stored in them hardly ever move — unlike clustered indexes, in which every row can move at any time to a different physical location.

This difference affects non-clustered indexes as they refer to the rows in the heap or clustered index: in case of a heap, they can just store the physical location of that row in the non-clustered indexes (as they hardly ever change). If there is a clustered index, the non-clustered indexes store the clustering key.

If you are eventually using a non-clustered index, the effort to reach the actual table in either the heap or clustered index is very different: With a heap, you just need one physical IO, with a clustered index, you need to do a Clustered Index Seek, which is typically 3-5 logical IOs (depends on the table size).

If you have many non-clustered indexes and are using them without index-only scan (that means there follows a RID access), a clustered index may hurt performance considerably.

More details about that in this article I wrote:

http://use-the-index-luke.com/blog/2014-01/unreasonable-defaults-primary-key-clustering-key