Context: SQL Server 2008. There are 2 tables to inner join. The fact table, which has 40 million rows, contains the patient key and the medications administered and other facts. There is a unique index (nonclustered) on medication key and patient key combined in that order. The dimension table is the medication list (70 rows). The join is to get the medication code (business code) based on medication key (surrogate key). Query:
SELECT a.PKey, a.SomeFact, b.MCode
FROM tblFact a
JOIN tblDIM b ON a.MKey = b.MKey
All the columns returned are integer. The above query runs in 7 minutes and its execution plan shows the index on (MKey,PKey) is used. The index was rebuilt right before the run. When I disabled the index on the fact table (or copy data to a new table with same structure but without index), the same query takes only 1:40 minutes.
IO Statistics are also stunning.
With index: Table 'tblFACT'. Scan count 70, logical reads 190296338, physical reads 685138, read-ahead reads 98713
Without index: Table 'tblFACT_copy'. Scan count 17, logical reads 468891, physical reads 0, read-ahead reads 419768
Question: why does it try to use the index and head down the inefficient path?