0
votes

I am currently working on a project where I am using MongoDB as database with multiple collections. My biggest collection will likely not have more than 10000-15000 documents in the future, currently there are about 5500 documents. There are 8 fields I use for querying the collection. I am now looking for an indexing strategy. My problem is that all 8 fields are not obligatory and can occur in different combinations. This makes creating compound indexes very hard for me. Does anyone have a suggestion how to solve this?

I came up with 15 compound indexes for the collection, which I would think cover some of the most important queries. Are these too many for one collection? I am also periodically inserting/updating data to the database from another data source so I worry about slowing down the writing performance too much. Users only perform reading operations. I am also not sure if indexing makes sense for such rather 'small' collections. With the actual size of 5500 documents non-indexed queries take about 6-8 ms to be completed.

1

1 Answers

0
votes

Probably the collection is too small to get a noticeable improvement. But if that collection is expected to be increased in the future, it will need an indexing strategy.

MongoDB (and many other databases) have support for multiple colums/fields indexing but the order is important. For example, if there are 3 columns A, B, C and a compound index is created as (A, B, C):

  • All queries using A or A+B or A+B+C columns will take advantage of the index
  • The rest of queries wont, for example B or C or B + C or A + C

To allow all possible combinations several indexes are needed (that is quite expensive).

But SURPRISE! MongoDB support indexing multiple values per document (using arrays), it is possible taking advantage of that and apply the pattern called "Attribute pattern": https://www.mongodb.com/blog/post/building-with-patterns-the-attribute-pattern

In the case of searching by several fields, each document could have an extra attribute (for instance "search"), here is an example:

{
    "name": "apple",
    "kind": "fruit",
    "color": "green",
    "size": 33,
    "search": [
        {"k": "name",  "v": "apple"},
        {"k": "kind",  "v": "fruit"},
        {"k": "color", "v": "green"},
        {"k": "size",  "v": 33}
    ]
}

And an a compound index (search.k, search.v). To take advantage of that index, the user query should be adapted/rewritten from this:

{"name":"apple", "color": "green"}

to this:

{
    "$and":[
        {"search":{"$elemMatch": {"k": "name", "v": "apple"}}},
        {"search":{"$elemMatch": {"k": "color", "v": "green"}}}
    ]
}

More about $elemMatch https://docs.mongodb.com/manual/reference/operator/query/elemMatch/

This pattern will produce instant query results in collections with hundreds of millions of documents.