2
votes

I am restoring a large MongoDB database (90 GB) with mongorestore, and it fails silently somewhere between 70% and 90% of the way through. Is there a way to skip the first n documents of the dump when launching mongorestore? I know there is a filter option that takes a query, but it does not help.

If I try to restore the whole backup again, it takes a long time because of duplicate index errors. I tried restoring the database once more, but it failed again (due to a socket exception).

As mongorestore seems to process the BSON dump sequentially, I was wondering whether there is a way to say: "just skip the first 1,234,567 documents of the dump and restore the rest".

I have just one large collection. I have already broken the dump into several parts, but that does not seem to be enough. It would be much easier to tell mongorestore to skip the documents that were already restored and carry on.

Thanks

1
Is your intention to skip the first n documents, or would you be happy with restoring the whole backup? Also, how many collections are in your database? I'm thinking you could write a script that backs up each collection and then restores them one by one. - Juan Carlos Farah
I have already split the dump into parts. See the edit above. I will continue this way if there is no way to tell mongorestore to skip documents. - Chabbey François

1 Answer

2
votes

As far as I know, there is no way to tell MongoDB to skip n documents when doing a mongorestore, but you can take advantage of the --filter option in order to do something that emulates this. Assuming you are using ObjectIds or an _id that has some sort of sequence, you can do a query on your collection to find the _id of the nth document. Something like this:

db.collection.find({}, { "_id" : 1 }).sort({ "_id" : 1 }).skip(n-1).limit(1);

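You can then pass this _id as a parameter to the --filter option, telling it to restore only the documents whose _id is greater than this value. Note that if your _id values are ObjectIds, a plain quoted string will not compare against them, so the value has to be given as an ObjectId rather than as a string. Something like this: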

mongorestore --filter '{"_id": { $gt : "<ID>" }}'

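If you want to start from scratch instead, you can add the --drop option to the command above. Keep in mind that it drops each collection before restoring it, so the documents you already restored will be removed.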

This should only restore the documents with an _id greater than the one for the nth document, effectively skipping the first n documents in the collection.
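
If the _id values have no usable order, another route is to trim the dump file itself, since a .bson dump is just BSON documents written one after another, each starting with a 4-byte little-endian length. The following is only a minimal Node.js sketch under that assumption; `skipDocuments` is a hypothetical helper name (not part of mongorestore or any MongoDB tool), and I have not run it against a 90 GB dump.

```javascript
// Sketch: copy a mongodump .bson file while skipping its first n documents.
// Assumption: the file is a plain concatenation of BSON documents, each
// beginning with a little-endian int32 that holds the document's total size.
const fs = require("fs");

// Hypothetical helper, not part of any MongoDB tool.
// Returns the number of documents written to outputPath.
function skipDocuments(inputPath, outputPath, n) {
  const input = fs.openSync(inputPath, "r");
  const output = fs.openSync(outputPath, "w");
  const header = Buffer.alloc(4);
  let position = 0; // byte offset of the next document in the input file
  let seen = 0;     // documents encountered so far
  let written = 0;  // documents copied to the output file
  try {
    // Read one 4-byte size header at a time until the end of the file.
    while (fs.readSync(input, header, 0, 4, position) === 4) {
      const size = header.readInt32LE(0);
      if (size < 5) {
        // The smallest valid BSON document (an empty one) is 5 bytes long.
        throw new Error(`invalid BSON size ${size} at offset ${position}`);
      }
      if (seen >= n) {
        // Past the skipped prefix: copy the whole document, header included.
        const doc = Buffer.alloc(size);
        const got = fs.readSync(input, doc, 0, size, position);
        if (got !== size) {
          throw new Error(`truncated document at offset ${position}`);
        }
        fs.writeSync(output, doc);
        written += 1;
      }
      // Skipped documents are never loaded; we just jump over them.
      position += size;
      seen += 1;
    }
  } finally {
    fs.closeSync(input);
    fs.closeSync(output);
  }
  return written;
}
```

You would call something like `skipDocuments("collection.bson", "rest.bson", 1234567)` and then point mongorestore at the trimmed file instead of the original. Since only the size headers of the skipped documents are read, skipping is cheap even for a large n, but n has to match the number of documents that were really restored, so check the collection count first.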