2
votes

note: I'm using Mongodb 4 and I must use aggregation, because this is a step of a bigger aggregation

Problem

How to find in a collection documents that contains fields that starts with value from another field in same document ?

Let's start with this collection:

db.regextest.insert([
{"first":"Pizza", "second" : "Pizza"},
{"first":"Pizza", "second" : "not pizza"},
{"first":"Pizza", "second" : "not pizza"}
])

and an example query for exact match:

db.regextest.aggregate([
{
    $match :  { $expr: { $eq: [ "$first" ,"$second" ] }   }    }
])

I will get a single document

{
    "_id" : ObjectId("5c49d44329ea754dc48b5ace"),
    "first" : "Pizza",    "second" : "Pizza"
}

And this is good.

But how to do the same, but with startsWith ? My plan was to use regex but seems that is not supported in aggregation so far.

With a find and a custom javascript function works fine:

db.regextest.find().forEach(
    function(obj){
        if (obj.first.startsWith(obj.second)){
            print(obj);
        }
    }
)

And returns correctly:

{
    "_id" : ObjectId("5c49d44329ea754dc48b5ace"),
    "first" : "Pizza",
    "second" : "Pizza"
}

How it's possible to get same result with aggregation framework ?

One idea is to use existing aggregation framework pipeline, out to a temp colletion and then run the find above, to get match I'm looking for. This seems to be a workaround, I hope someone have a better idea.

Edit: here the solution

db.regextest.aggregate([{
        $project : {
            "first" : 1,
            "second" : 1,
            fieldExists : {
                $indexOfBytes : ['$first', '$second' , 0]
            }
        }
    }, {
        $match : {
            fieldExists : {
                $gt : -1
            }
        }
    }
]);
1
You can use $indexOfCPAshh
thanks for suggestions! I've found my answer here: stackoverflow.com/questions/3305561/…Vokail
@AsyaKamsky: Could you be so kind and explain your solution a bit, please? Why is it better?B--rian
@B--rian sure - it uses aggregation expression (via $expr) referring to as "f1" and "f2" field names (called in the question "first" and "second") to check for equality between two strings: "$f2" and substring of f1 from position 0 which is size of f2 string). So if I have f1: "pizza is good" and f2: "pizza" then substring of of f1 from position 0 that's 5 characters is 'pizza' and it's a match. Come to think of it, I don't think this is a duplicate of marked question, I'll see about getting it unduped and adding the answer.Asya Kamsky
as far as why it's better - the request was for a $match and it does a $match in a single stage without adding any superfluous fields to the document. Not to mention the fact that $project loses the rest of the fields except _id.Asya Kamsky

1 Answers

1
votes

The simplest way is to use $expr first available in 3.6 like this:

{$match:{$expr:{$eq:[
    “$second”,
    {$substr:{
        “$first”,
        0,
        {$strLenCP:”$second”}
    }}
]}}}

This compare the string in field ‘second’ with the first N characters of ‘first’ where N is the length of second string. If they are equal, then ‘first’ starts with ‘second’.

4.2 adds support for $regex in aggregation expressions, but starts with is much simpler and doesn’t need regular expressions.