1
votes

I have a MongoDB collection with documents in the following format:

    { "_id" : 1, "tokens": [ "I", "have", "a", "dream" ] },
    { "_id" : 2, "tokens": [ "dream", "a", "little", "dream" ] },
    { "_id" : 3, "tokens": [ "dream", "a", "dream" ] },
    { "_id" : 4, "tokens": [ "a" , "little", "dream" ] },
    ...

I need to get all documents whose "tokens" array contains the elements "a", "dream" contiguously and in that order. So, the following documents match:

    { "_id" : 1, "tokens": [ "I", "have", "a", "dream" ] },
    { "_id" : 3, "tokens": [ "dream", "a", "dream" ] },

Is there a way to get the right results?


1 Answer

0
votes

One trick is to join the arrays into strings and compare them with a regular expression.

  • $match with $all to keep only documents whose tokens contain every input token
  • $addFields to copy tokens into duplicate and store the input array as inputData
  • $reduce to join each array into a single "-"-delimited string
  • $regexMatch to test whether the joined input occurs inside the joined tokens
  • $match to drop documents where that test is false
  • $project to return only the necessary fields

The code is

[{
    $match: {
        tokens: { $all: ["a", "dream"] }
    }
}, {
    $addFields: {
        duplicate: "$tokens",
        inputData: ["a", "dream"]
    }
}, {
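    $addFields: {
        // Put "-" on both sides of every token ("-a-dream-") so that a
        // prefix such as "dream" in "dreamer" cannot match.
        duplicate: {
            $reduce: {
                input: "$duplicate",
                initialValue: "-",
                in: { $concat: ["$$value", "$$this", "-"] }
            }
        },
        inputData: {
            $reduce: {
                input: "$inputData",
                initialValue: "-",
                in: { $concat: ["$$value", "$$this", "-"] }
            }
        }
    }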
}, {
    $addFields: {
        match: {
            $regexMatch: { input: "$duplicate", regex: '$inputData' }
        }
    }
}, {
    $match: {
        match: true
    }
}, {
    $project: {  _id: 1,  tokens: 1 }
}]

Working Mongo playground

Note: This works for the sample data, but do test it against other scenarios, for example tokens that contain "-" or regex metacharacters.
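
To try out such scenarios without a database, the join-and-search idea behind the pipeline can be emulated in plain JavaScript. This is only a rough sketch: `joinLeading`, `joinBoth` and `contains` are hypothetical helper names, and a plain substring test stands in for `$regexMatch`, which is equivalent only when the tokens contain no regex metacharacters. It compares two ways of delimiting the tokens: "-" before each token, and "-" on both sides of each token.

```javascript
// Plain-JavaScript emulation of the join-and-search idea (no MongoDB needed).
// joinLeading, joinBoth and contains are hypothetical helper names.

// "-" before each token: ["a", "dream"] -> "-a-dream"
const joinLeading = (tokens) => tokens.reduce((acc, t) => acc + "-" + t, "");

// "-" on both sides of each token: ["a", "dream"] -> "-a-dream-"
const joinBoth = (tokens) => tokens.reduce((acc, t) => acc + t + "-", "-");

// Substring test standing in for $regexMatch (assumes tokens have no
// regex metacharacters).
const contains = (join, tokens, wanted) => join(tokens).includes(join(wanted));

const docs = [
  { _id: 1, tokens: ["I", "have", "a", "dream"] },
  { _id: 2, tokens: ["dream", "a", "little", "dream"] },
  { _id: 3, tokens: ["dream", "a", "dream"] },
  { _id: 4, tokens: ["a", "little", "dream"] },
];
const wanted = ["a", "dream"];

const matchedIds = docs
  .filter((d) => contains(joinBoth, d.tokens, wanted))
  .map((d) => d._id);
console.log(matchedIds); // [ 1, 3 ]

// Prefix case: a leading delimiter alone lets "dream" match inside "dreamer".
console.log(contains(joinLeading, ["a", "dreamer"], wanted)); // true (false positive)
console.log(contains(joinBoth, ["a", "dreamer"], wanted)); // false

// A token that itself contains the delimiter still fools both variants.
console.log(contains(joinBoth, ["a-dream"], wanted)); // true (false positive)
```

If the tokens can contain "-", picking a delimiter that never occurs in the data is probably the simplest workaround.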