95
votes

Bit of an odd one on query performance... I need to run a query which does a total count of documents, and can also return a result set that can be limited and offset.

So, I have 57 documents in total, and the user wants 10 documents offset by 20.

I can think of 2 ways of doing this, first is query for all 57 documents (returned as an array), then using array.slice return the documents they want. The second option is to run 2 queries, the first one using mongo's native 'count' method, then run a second query using mongo's native $limit and $skip aggregators.

Which do you think would scale better? Doing it all in one query, or running two separate ones?

Edit:

// 1 query
var limit = 10;
var offset = 20;

Animals.find({}, function (err, animals) {
    if (err) {
        return next(err);
    }

    res.send({count: animals.length, animals: animals.slice(offset, limit + offset)});
});


// 2 queries
Animals.find({}, {limit:10, skip:20} function (err, animals) {            
    if (err) {
        return next(err);
    }

    Animals.count({}, function (err, count) {
        if (err) {
            return next(err);
        }

        res.send({count: count, animals: animals});
    });
});
6
I am unsure about Mongoose however the default count() function in PHP does not take limit or skip into account unless told to so just running one query of limit and skip and then getting the count should give the most performant solution here probably. However how will you kow there are 57 documents if you don't do two queries to count what is currently there? Do you have a static number that never changes? If not then you will need to do both the skip and limit then the count.Sammaye
Sorry, I was talking about using Mongo's native count method db.collection.find(<query>).count();leepowell
Sorry it was me, I misread your question. Hmmm actually I am not sure which would be better, will your result set always be really low like 57 docs? If so then client side slice might be a milisecond more performant.Sammaye
I've added example to the original question, I don't think the data will ever get as high as 10,000+ but potentially it could.leepowell
At 10k records you could see the memory handling of JS be less performant than the count() function of MongoDB. The count() function in MongoDB is relatively slow but it is still pretty much as fast as most client side variations on larger sets and it could be faster than client side counting here possibly. But that part is subjective to your own testing. Mind you I have counted 10k length arrays easily before so it might be faster client side, it is very hard to say at 10k elements.Sammaye

6 Answers

147
votes

I suggest you to use 2 queries:

  1. db.collection.count() will return total number of items. This value is stored somewhere in Mongo and it is not calculated.

  2. db.collection.find().skip(20).limit(10) here I assume you could use a sort by some field, so do not forget to add an index on this field. This query will be fast too.

I think that you shouldn't query all items and than perform skip and take, cause later when you have big data you will have problems with data transferring and processing.

25
votes

Instead of using 2 separate queries, you can use aggregate() in a single query:

Aggregate "$facet" can be fetch more quickly, the Total Count and the Data with skip & limit

    db.collection.aggregate([

      //{$sort: {...}}

      //{$match:{...}}

      {$facet:{

        "stage1" : [ {"$group": {_id:null, count:{$sum:1}}} ],

        "stage2" : [ { "$skip": 0}, {"$limit": 2} ]
  
      }},
     
     {$unwind: "$stage1"},
  
      //output projection
     {$project:{
        count: "$stage1.count",
        data: "$stage2"
     }}

 ]);

output as follows:-

[{
     count: 50,
     data: [
        {...},
        {...}
      ]
 }]

Also, have a look at https://docs.mongodb.com/manual/reference/operator/aggregation/facet/

4
votes
db.collection_name.aggregate([
    { '$match'    : { } },
    { '$sort'     : { '_id' : -1 } },
    { '$facet'    : {
        metadata: [ { $count: "total" } ],
        data: [ { $skip: 1 }, { $limit: 10 },{ '$project' : {"_id":0} } ] // add projection here wish you re-shape the docs
    } }
] )

Instead of using two queries to find the total count and skip the matched record.
$facet is the best and optimized way.

  1. Match the record
  2. Find total_count
  3. skip the record
  4. And also can reshape data according to our needs in the query.
3
votes

After having to tackle this issue myself, I would like to build upon user854301's answer.

Mongoose ^4.13.8 I was able to use a function called toConstructor() which allowed me to avoid building the query multiple times when filters are applied. I know this function is available in older versions too but you'll have to check the Mongoose docs to confirm this.

The following uses Bluebird promises:

let schema = Query.find({ name: 'bloggs', age: { $gt: 30 } });

// save the query as a 'template'
let query = schema.toConstructor();

return Promise.join(
    schema.count().exec(),
    query().limit(limit).skip(skip).exec(),

    function (total, data) {
        return { data: data, total: total }
    }
);

Now the count query will return the total records it matched and the data returned will be a subset of the total records.

Please note the () around query() which constructs the query.

3
votes

There is a library that will do all of this for you, check out mongoose-paginate-v2

1
votes

You don't have to use two queries or one complicated query with aggregate and such.

You can use one query

example:

const getNames = async (queryParams) => {

  const cursor = db.collection.find(queryParams).skip(20).limit(10);
  return {
    count: await cursor.count(),
    data: await cursor.toArray()
  }
  
}

mongo returns a cursor that has predefined functions such as count, which will return the full count of the queried results regardless of skip and limit

So in count property, you will get the full length of the collection and in data, you will get just the chunk with offset of 20 and limit of 10 documents