0
votes

I'm having troubles understanding basic concepts of compound indexes. Let's say I have a collection with this schema:

{
    _id: 1,
    field1: 'aaa',
    field2: 'bbb',
    field3: 'ccc'
}

In https://docs.mongodb.org/manual/core/index-compound/ it says very clearly, and I understand, that this index: { field1: 1, field: 2} will support queries only on field1 and on both field1 AND field2, this make sense.

Now, What if I want to support queries on all fields (1, 2 and 3) where field 1 is always used? I want to support the option to query on field1 AND field2 and field1 AND field3 on all field1 AND field2 AND field3.

I know it's possible to create compound index for each of these query options, but the question is if I can use less compounds that will query in the same performance.

For example, if I have an index: {field1: 1, field2: 1, field3: 1} and I query for field1 AND field3, it will use this index, but the index bounds on field2 will be the "[MinKey, MaxKey]". Does it worth it to create another index: {field1: 1, field3: 1}?

1
Maybe you can test your data with explian, with and without the index {field1: 1, field3: 1}, compare those two results, it will give you some clue...zangw
@zangw I did that and that's why I know it uses the index with [MinKey, MaxKey] index bounds. But it doesn't give me performance evaluationTomG

1 Answers

0
votes

If you try creating a collection with those documents, and indexes, then doing an explain on the query, mongo will select the index: {field1: 1, field3: 1} and reject the other plans. This suggests that mongo finds one index preferable to the other index.

For example the explain returns the following:

"winningPlan" : {
    "stage" : "FETCH",
    "inputStage" : {
        "stage" : "IXSCAN",
        "keyPattern" : {
            "field1" : 1,
            "field3" : 1
        },
        "indexName" : "field1_1_field3_1",
        "isMultiKey" : false,
        "isUnique" : false,
        "isSparse" : false,
        "isPartial" : false,
        "indexVersion" : 1,
        "direction" : "forward",
        "indexBounds" : {
            "field1" : [
                "[\"aaa\", \"aaa\"]"
            ],
            "field3" : [
                "[\"ccc\", \"ccc\"]"
            ]
        }
    }
},
"rejectedPlans" : [
    {
        "stage" : "FETCH",
        "filter" : {
            "field3" : {
                "$eq" : "ccc"
            }
        },
        "inputStage" : {
            "stage" : "IXSCAN",
            "keyPattern" : {
                "field1" : 1,
                "field2" : 1
            },
            "indexName" : "field1_1_field2_1",
            "isMultiKey" : false,
            "isUnique" : false,
            "isSparse" : false,
            "isPartial" : false,
            "indexVersion" : 1,
            "direction" : "forward",
            "indexBounds" : {
                "field1" : [
                    "[\"aaa\", \"aaa\"]"
                ],
                "field2" : [
                    "[MinKey, MaxKey]"
                ]
            }
        }
    },
    {
        "stage" : "FETCH",
        "inputStage" : {
            "stage" : "IXSCAN",
            "keyPattern" : {
                "field1" : 1,
                "field2" : 1,
                "field3" : 1
            },
            "indexName" : "field1_1_field2_1_field3_1",
            "isMultiKey" : false,
            "isUnique" : false,
            "isSparse" : false,
            "isPartial" : false,
            "indexVersion" : 1,
            "direction" : "forward",
            "indexBounds" : {
                "field1" : [
                    "[\"aaa\", \"aaa\"]"
                ],
                "field2" : [
                    "[MinKey, MaxKey]"
                ],
                "field3" : [
                    "[\"ccc\", \"ccc\"]"
                ]
            }
        }
    }
]

However there is the issue of the index size, and whether all the indexes you'd be creating will fit into RAM alongside the database working set. If they dont then it will hold the index on disk, which is significantly slower to read. The best compromise then might be to use the index on all three fields, since this is better than no index, and involves less maintenance than three indexes