Why index is not used for some values?

Question

We have table "Customers" with email and siteId columns. And there's an index consisting of those two columns. So, related queries were always using this index when performing queries and filtering by email, like this one:

select * from customers where email = '[email protected]';

We can assure the index was used because we have checked the query with explain plan. Now, we had an interesting case yesterday. The same query with a clause "email = '[email protected]' was not using index at all. Explain plan showed oracle did full scan. The table contains millions of records, so this query without index was hitting 30 seconds. Interesting part is the query did use index with any of other email values ([email protected], [email protected]). What are the reasons why oracle didn't use index?

One reason that comes to me the index does not contain such value, so at first oracle traverses the index and then does full scan, but then explain plan should show it performed index scan. In our case explain plan just shows full scan. The other reason it's somehow related with db restore but also not sure how it can be that index does not contain such value.

UPDATE All the tables after restore, have last_analyzed date value of one week old. In this particular case with the email, all the emails are updated to some random values due to customer privacy after update. Customer table have several millions records so yes, there's a quite huge change after restore but still don't understand how it can be related with not using an email because this customer table already have millions of records before restore.

UPDATE2 the index is used after we performed gathering statistics on the table.

UDATE3 well, all the email values are unique in customer table, so it's unlikely CBO decided there are too many rows with such particular email. It's just there are a lot of emails with the same starting characters "random..." but that should not be considered as identical values, right?

It wouldn't do a full scan if it didn't find an index hit; it will do one or the other, not both. (Did it do a full index scan, or a full table scan?) Assuming your stats are current, does the index have histograms, and do you have a lot of addresses similar to the one that behaved differently? — Alex Poole
I do agree with Alex, this could be possible reason that since the cardinality for a particular email id is high, Oracle is going for full table scan. — San
Were the stats last gathered before the emails were updated to random values? If so perhaps there were a lot of similar (or identical) values originally, and your updated/random value just happens to be aligning with a popular original choice. If gathering stats solved it, then great *8-) — Alex Poole

Steve Steve · Accepted Answer · 2014-01-29T15:19:48

The server will not look at the data in the index and then scan the table if not found. Before it does anything, it will look at the statistics to see if it should use an index or scan the table. I would guess that the email address passed, for some reason, made oracle think it would be better to scan the table than look at the index. There are too many factors to come to an exact conclusion. Check out this information on the statistics Oracle Doc on Stats. You might look at changing the amount of data the statistic for the index covers and it will help.

Why index is not used for some values?

2 Answers