6
votes

So I had this issue today where my MongoDB queries where extremely slow and timing out. I posted this question - MongoDB too many records? and his suggestion was right wherein I had to ensureIndex and remove case-insensitivity. I tried it in Mongo shell, it worked perfectly.

However, when I ran it via PHP, it was still the same :( I then realized the query had a sort on "id" (not _id) field and when I removed that, things were blazing fast. But with the sort, it was REALLY slow. I already had an index on the id field. This is the query btw :

db.tweet_data.find({ 
...     $or: 
...         [ 
...             { in_reply_to_screen_name: /^kunalnayyar$/, handle: /^kaleycuoco$/, id: { $gt: 0 } }, 
...             { in_reply_to_screen_name: /^kaleycuoco$/, handle: /^kunalnayyar$/, id: { $gt: 0 } } 
...         ], 
...     in_reply_to_status_id_str: { $ne: null } 
...     
...     } ).sort({id:-1})explain()

So my indexes are : (not composite) { {id:-1} , {handle:1}, {in_reply_to_screen_name:1} } After some reading I realized it should have been a composite index and I tried two variations to no success : 1. {handle:1, in_reply_to_screen_name:1, id:-1} 2. {id:-1,handle:1, in_reply_to_screen_name:1}

I am not sure where I am going wrong, but I am pretty sure the issue is indexing here. I am just too buzzed and can't understand the order and the fields to index

1
Depends on the order of fields in your document..Try {in_reply_to_screen_name:1, handle:1} or {in_reply_to_screen_name:1, handle:1, id:-1}Aafreen Sheikh
Only {in_reply_to_screen_name:1, handle:1} won't work because there is a sort on {id:-1} I have tried both variations of handle, in_reply, id but nothing has worked.Ayush Chaudhary
{in_reply_to_screen_name:1, handle:1, id:-1} should do..And this is different from the two that you mention you have tried..When I said, "depends on the order of fields in the document", I meant the "search document".Aafreen Sheikh
@Ayush: do you need the id: { $gt: 0 } in the query, or has that been removed? If that is actually a default ObjectId you shouldn't need to assert it is greater than 0. Similarly, you may want to include in_reply_to_status_id_str in the $or queries/index rather than separately. I would try explaining the individual queries first before combining in an $or.Stennie
Also, have a look at the nScanned numbers in the explain() versus the n.Stennie

1 Answers

1
votes

You should run explain against your query, it will help you figure out what's going on.

It's likely that Mongo isn't using an index for both filtering and sorting. When you use an $or, it can use multiple indexes to match the options. But when you add a sort it may make it not use indexes available for filtering.

When you want to sort on a query, you need to make sure the sorted field is in the index you want to hit (last, or it can't use it to sort).

You may be able to speed it up by passing an index hint, too. I don't know how many docs your query matches, but if it's a small number and you make sure the initial conditions are hitting an index, the sort on _id can be done quickly.