Consider the following 3 documents (simplified):

{
    "name" : "Alex Ham",
    "opponents" : [ 
        {
            "location" : "west",
            "position" : 5
        },
        {
            "location" : "north",
            "position" : 7
        },
        {
            "location" : "east",
            "position" : 2
        }
    ]
},
{
    "name" : "John Flex",
    "opponents" : [ 
        {
            "location" : "north",
            "position" : 9
        },
        {
            "location" : "north",
            "position" : 4
        },
        {
            "location" : "south",
            "position" : 2
        }
    ]
},
{
    "name" : "Adam Will",
    "opponents" : [ 
        {
            "location" : "east",
            "position" : 6
        },
        {
            "location" : "south",
            "position" : 8
        }
    ]
}

I need to match documents by the sequence of items in the opponents array. I'll be given integers such as 572, 942, 68, etc. Each digit of the integer is the position value of the item at the corresponding index in the opponents array.

For example, for 572 I need to match the first document (Alex Ham), because the position fields of the items in its opponents array are 5, 7, and 2, in that order.

For 942 I have to match the second document (John Flex), for 68 the third document (Adam Will), and so on.

Given a large data set (a few million documents) and with performance in mind, how can I build an aggregation pipeline that matches documents in this way?

2 Answers

0
votes

I do not think you need aggregation in this case. To do this with aggregation you would have to go through the whole collection and project every document into a new form containing a value that can be matched against your input. Aggregation is fast, but a full pass over a few million documents is still not good enough.

Instead, I suggest converting your input into a form that can be matched against the documents directly:

const val = 572;

const arr = val.toString().split('');
const size = arr.length;

const selector = arr.map((v, i) => {
  return {
    [`opponents.${i}.position`]: parseInt(v, 10),
  };
}).reduce((acc, cur) => {
  return Object.assign(acc, cur);
}, {});

selector.opponents = {
  $size: size,
};
console.log(selector);

Now using this new selector, you can use the normal .find operation to get the documents:

collection.find(selector);
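
As a rough sketch, the selector construction above can be wrapped in a reusable function. The name buildSelector is hypothetical, and the sketch assumes every position is a single digit (0-9), as the question implies. The function only builds the filter object; it does not touch the database.

```javascript
// Hypothetical helper (the name is illustrative, not part of any driver API).
// Builds a find() filter from an integer whose digits are the expected
// "position" values, in array order. Assumes single-digit positions.
function buildSelector(val) {
  const digits = String(val).split('').map((d) => parseInt(d, 10));
  const selector = {};
  digits.forEach((digit, i) => {
    // Dot notation with a numeric index targets the i-th array element.
    selector[`opponents.${i}.position`] = digit;
  });
  // Pin the array length so that 57 does not match positions 5, 7, 2.
  selector.opponents = { $size: digits.length };
  return selector;
}

console.log(buildSelector(572));
```

For 572 this produces a filter with the keys opponents.0.position, opponents.1.position, and opponents.2.position set to 5, 7, and 2, plus a $size condition of 3, which can be passed to collection.find.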
0
votes

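The pipeline below solves the problem. The basic logic is to $unwind the array while preserving each item's index, then use that index to compute the item's place value in the integer with the formula position * pow(10, size - index - 1). Summing the place values per document reconstructs the integer, which the final $match compares against the input.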

[
  {$project: {name: 1, opponents: 1, sz: {$size: '$opponents'}}},
  {$unwind: {path: '$opponents', includeArrayIndex: 'i'}},
  {$project: {
    name: 1,
    opponents: 1,
    num: {$multiply: [{$pow: [10, {$subtract: [{$subtract: ['$sz', 1]}, '$i']}]}, '$opponents.position']}
  }},
  {$group: {
    _id: '$_id',
    num: {$sum: '$num'},
    name: {$first: '$name'},
    opponents: {$push: '$opponents'}
  }},
  {$match: {num: 572}},
]
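
To check the place-value formula position * pow(10, size - index - 1) outside the database, here is a minimal plain JavaScript sketch of the same arithmetic. The function name encodePositions is hypothetical, and single-digit positions are assumed.

```javascript
// Hypothetical helper mirroring the pipeline's arithmetic in plain JavaScript.
// Each item contributes position * 10^(size - index - 1); the sum is the integer.
function encodePositions(opponents) {
  const size = opponents.length;
  return opponents.reduce(
    (sum, opp, index) => sum + opp.position * Math.pow(10, size - index - 1),
    0
  );
}

// Alex Ham's array from the question: 5*100 + 7*10 + 2*1
console.log(encodePositions([
  { location: 'west', position: 5 },
  { location: 'north', position: 7 },
  { location: 'east', position: 2 },
])); // 572
```

The same calculation gives 942 for John Flex and 68 for Adam Will, which are the values the final $match stage would compare against.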