MongoDB compound index not being used

Question

I have a MongoDB collection with close to 100k documents. Each document has the following 3 fields.

arrayX: [ObjectId]
someID: ObjectId
timestamp: Date

I have created a compound index for the 3 fields in that order.

When I then run an aggregate query (written below in pseudocode):

match(
  and(
    arrayX: (elemMatch: A),
    someID: Y
  )
)
sort(timestamp: 1)

it does not end up using the compound index.
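For reference, the pseudocode above might look like this in real pipeline syntax. This is only a sketch: `buildPipeline` is a hypothetical helper name, `a` and `y` stand in for the ObjectId values A and Y, and `$elemMatch` with `$eq` is one way to express "some element equals A". The explicit `and` is dropped because top-level conditions in `$match` are ANDed implicitly.

```javascript
// Hypothetical helper: builds the aggregation pipeline described in the question.
// `a` and `y` stand in for the ObjectId values A and Y.
function buildPipeline(a, y) {
  return [
    // Keep documents whose arrayX has an element equal to `a`
    // and whose someID equals `y` (both conditions must hold).
    { $match: { arrayX: { $elemMatch: { $eq: a } }, someID: y } },
    // Sort ascending by timestamp.
    { $sort: { timestamp: 1 } },
  ];
}

// In mongosh this could be run as, for example:
//   db.myCollection.aggregate(buildPipeline(A, Y))
// where `myCollection` is a placeholder collection name.
```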

I know this because when I use .explain(), the winningPlan stage is FETCH, its inputStage is IXSCAN, and the indexName is timestamp_1, which means it's only using the separate single-field index I created on the timestamp field.
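Reading the plan tree by eye gets tedious once plans are nested, so a small helper can pull out the index names. The sketch below assumes the plan node has the usual `stage`, `inputStage` / `inputStages`, and `indexName` fields; `indexNamesInPlan` is a hypothetical name, and the sample object is hand-written to mirror the plan described above, not real server output.

```javascript
// Recursively collect the indexName of every IXSCAN stage in a plan tree.
function indexNamesInPlan(stage, found = []) {
  if (!stage || typeof stage !== "object") return found;
  if (stage.stage === "IXSCAN" && stage.indexName) found.push(stage.indexName);
  // Most stages have a single child; some (e.g. OR) have several.
  if (stage.inputStage) indexNamesInPlan(stage.inputStage, found);
  for (const child of stage.inputStages || []) indexNamesInPlan(child, found);
  return found;
}

// Hand-written sample shaped like the winning plan described in the question.
const sampleWinningPlan = {
  stage: "FETCH",
  inputStage: { stage: "IXSCAN", indexName: "timestamp_1" },
};

console.log(indexNamesInPlan(sampleWinningPlan)); // [ 'timestamp_1' ]
```

In mongosh you would pass it the winningPlan from the explain output. For aggregations, the queryPlanner section may sit at the top level or be nested under the first pipeline stage, depending on the server version, so check where it is in your output.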

What's interesting is that if I remove the sort stage and keep everything else exactly the same, MongoDB ends up using the compound index.

What gives?

Joe Joe · Accepted Answer · 2020-02-13T04:17:24

Multi-key indexes are not useful for sorting. I would expect that a plan using the compound index was listed in rejectedPlans.

If you run explain with the allPlansExecution option, the response will also show you the execution times for each plan, among other things.
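As a rough sketch of how to compare the candidates, the helper below flattens the allPlansExecution array into one row per plan. `summarizePlans` is a hypothetical name. The field names it reads (nReturned, totalKeysExamined, totalDocsExamined, executionTimeMillisEstimate) are the ones I'd expect in that section, but verify them against your server's output. Note that these numbers come from the trial runs done during plan selection, so they are partial.

```javascript
// Summarize each candidate plan from an explain("allPlansExecution") result.
// `executionStats` is the executionStats section of the explain output.
function summarizePlans(executionStats) {
  return (executionStats.allPlansExecution || []).map((plan, i) => ({
    plan: i,
    nReturned: plan.nReturned,
    keysExamined: plan.totalKeysExamined,
    docsExamined: plan.totalDocsExamined,
    timeMillisEstimate: plan.executionTimeMillisEstimate,
  }));
}

// In mongosh (collection name and pipeline are placeholders):
//   const res = db.myCollection.explain("allPlansExecution").aggregate(pipeline);
// then pass the executionStats section of `res` to summarizePlans and
// print the rows, e.g. with console.table.
```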

Since the multi-key index can't be used for sorting the results, that plan would require a blocking sort stage. This means that all of the matching documents must be retrieved and then sorted in memory before sending a response.

On the other hand, using the timestamp_1 index means that documents will be encountered in a presorted order while traversing the index. The tradeoff here is that there is no blocking sort stage, but every document must be examined to see if it matches the query.

For data sets that are not huge, or when the query will match a significant fraction of the collection, the plan without a blocking sort will return results faster.

You might test creating another index on { someID:1, timestamp:1 } as this might reduce the number of documents scanned while still avoiding the blocking sort.
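Something like the following would create that index. It is a sketch: `createSuggestedIndex` is a hypothetical wrapper, and `coll` is whatever collection handle your environment provides (for example `db.myCollection` in mongosh, where `myCollection` is a placeholder).

```javascript
// Index suggested above: the equality field first, then the sort field.
const suggestedIndex = { someID: 1, timestamp: 1 };

// Create the index on the given collection handle and pass through
// whatever createIndex returns in your environment.
function createSuggestedIndex(coll) {
  return coll.createIndex(suggestedIndex);
}

// In mongosh (placeholder collection name):
//   createSuggestedIndex(db.myCollection)
```

With the default naming scheme the index would be called someID_1_timestamp_1, so after creating it you can re-run explain and check whether that name shows up as the indexName in the winning plan.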

The compound index is selected when you remove the sort stage probably because, in the compound-index plan, the blocking sort accounts for the majority of the execution time.

The fields in the executionStats section of the explain output are described on the Explain Results page of the MongoDB documentation. Comparing the estimated execution times for each stage may help you determine where you can tune the queries.
