1
votes

If I am going to be querying a table by Guids (irregardless of fragmentation problems with Guids), would it be faster to have the Guid as the clustered index rather than the non-clustered index or no index at all?

This question is coming from a read-only standpoint. I'm just curious if there will be a speed improvement between the searching rows for a specific Guid, and will searching complete faster with/without an index or with/without a clustered index?

Alternatively, I'm fairly certain in the answer to my next question, but now apply int identifiers to the previous question. Will it be faster to search if the table is clustered by that int? (This is rather than clustered by some other item in the table?)




I know there are many other questions posted on this topic, but I haven't found the specific answer that I'm looking for in any of these:
Should a Sequential Guid primary key column be a clustered index?
Improving performance of cluster index GUID primary key
Clustered primary key on unique identifier ID column in SQL Server
uniqueidentifier with index
Should I get rid of clustered indexes on Guid columns

Thanks for any help!

3
faster than what? I presume the alternative would be a non clustered covering index?Martin Smith
YIKES!! I would avoid GUID's as clustered indices in SQL Server like the devil ! Don't do it - even if searching that GUID is faster that way - most other operations will come to a crawl with GUIDs as CK....marc_s
@Martin Smith - I meant raster than either a non-clustered index, or no index at all @marc_s - what other operations are we talking about that will come to a crawl if I'm specifically only reading from the table?Brett

3 Answers

3
votes

The table will certainly query faster with Integer clustered indexes than GUID indexes. The reason being the size of the data type.

If you have already decided to go with GUIDs as key then probably generate these GUIDs using newSequentialId() instead of NewId() as this would reduce the effects of fragmentation in Guid indexes as the Ids ae always increasing and you have less chances of having a page split.

Adding to my point, it is a natural choice to go with this as a clustered index unless you have a potential candidate for a clustered index i.e. if you are using this guid not for key purposes. If its a relatively small table that is when you have a choice to not have an index else its always good to have indexes.

2
votes

Assuming MS SQL Server. This may or may not apply to other RDBMSs:

If you have a clustered index then it will be fastest, although if you're searching for a single row then the difference between that and a non-clustered index will be negligible. When you use a non-clustered index the server needs to first find the right value in the index and then go fetch the full record from the table storage. The table storage is the clustered index, so searching by a clustered index eliminates that step (called a Bookmark Lookup), but that step is almost imperceptible for a single row.

Clustered indexes tend to provide a bigger advantage for reading when they are on a column that is selected by range (for example, transaction date and you want to find all transactions for the past month). In that case the server can find the start and just read off the data in one quick, sequential sweep.

Having a non-clustered index on an INT (all other things being equal) will be slightly faster than using a GUID because the index itself will be smaller (because INTs are much smaller than GUIDs) which means that the server has to traverse fewer pages to find the value that it's looking to get. In the case of a clustered index I don't think that you'll see much of a difference if your row sizes are already large compared to the difference between a GUID and an INT, but I haven't done any testing on that.

1
votes

Like Tom already mentioned, the search on a clustered index for a single element will allways be faster. This is because the clustered index is thae data itself, and no lookups are requiered after you found your index entry.

The main advantage for a clustered index is the ability to extract "ranges" of data (like "last week", or "Orderhistory by Date"). Since a GUID tends to spread evenly over the table, you will fail to gain this benefit here. Also each table can only have one clustered index, so pick carefully.

If you query a table most commenly for a specific range, then consider that one as a clustered index.

There is also a 3rd kind, which is called a covering index. A covering index consists of several fields, which will be able so satisfy the most common query. For example, you have a USER table with a ID,Displayname,Password,LogonDate,..... and you will need the DisplayName frequently, creating a index based on ID,Displayname would be considered a covering index for a query like

Select Displayname from USER where ID=XYZ

Edit: One thing I forgot to mention. A GUID is quite a large object when it comes to SQL (Well... 16 Bytes). Having it as the clustered index forces all other indices on that table to inlcude the 16 Byte pointer to the GUID. This can add up if you have a bunch of indices on that table. Theclustered index is best is it is small and unique. Thats why INTs are so nice.