strategy for creating MongoDB short ids that scale

Question

I want friendlier-looking ids (i.e. YouTube style: /posts/cxB6Ey6) than MongoDB's ObjectID.

I read that for scalability it's best to leave _id as an ObjectID, so I thought about two solutions:

1) add an indexed postid field to each document

2) create a mapping collection between _id and the postid

In both cases, use something like https://github.com/dylang/shortid to generate the short id, and while generating, make sure the id is unique by querying the database. (Can this query-generate-insert sequence be an atomic operation?)

Will those solutions have a noticeable impact on performance?

What's the best strategy for doing this?

I don't think anyone went and read the shortid code suggested in the first post (github.com/dylang/shortid). It produces a unique identifier, provided you manage the host identifier when scaling. I will defer to the experts on not messing with the original ObjectID and go with the answer from Sammaye: put it into a new field (e.g. PostID) that you index. — Mikey Mr.H

Sammaye · Accepted Answer · 2013-01-05T16:16:37

The normal method of doing this is to base64 encode a unique id but:

add an indexed postid field to each document

You definitely want to go for this method. Of the two, I would say it is easily the more scalable and performant. For one, it needs only one round trip to get a short URL's details, whereas the second option would take two. Another consideration is the storage and index overhead of maintaining an extra collection, so this is a bit of a no-brainer.

I would not replace the _id field within the document either since the default ObjectId could still be useful in the foreseeable future.

So this narrows it down to a separate field and index (unique key) for the short code of a URL.
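As a rough sketch of the base64 idea mentioned at the top of this answer, combined with the separate-field approach: the 12 bytes of an ObjectId can be encoded as URL-safe base64, giving 16 characters instead of 24 hex digits (still longer than the 7-character example in the question). The function names, the example id, and the document layout below are hypothetical, and the _id is held as a plain hex string only to keep the sketch free of driver dependencies.

```python
import base64


def short_code_from_object_id(object_id_hex: str) -> str:
    """Encode a 24-character hex ObjectId as 16 URL-safe base64 characters."""
    raw = bytes.fromhex(object_id_hex)
    if len(raw) != 12:
        raise ValueError("expected a 12-byte ObjectId")
    # 12 bytes is a multiple of 3, so the output carries no '=' padding.
    return base64.urlsafe_b64encode(raw).decode("ascii")


def object_id_hex_from_short_code(code: str) -> str:
    """Reverse the encoding above."""
    return base64.urlsafe_b64decode(code).hex()


# Hypothetical post document: _id is left alone, the short code gets its own
# field, which would carry a unique index in the database.
example_id = "50e84ca5a1b2c3d4e5f60718"  # made-up id, for illustration only
post = {
    "_id": example_id,
    "postid": short_code_from_object_id(example_id),
    "title": "Hello",
}
```

Because the code is derived from a value that is already unique, no uniqueness query is needed before the insert; the unique index on postid is only a safety net.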

The next thing is that you don't want an ID which forces you to query the database for uniqueness prior to every insert. This is where the ObjectId shines: it can be generated within the client application and still be unique in the database, without having to query the database to verify that.

Unique ids that do not require querying the database first are normally time based. PHP's uniqid ( http://php.net/manual/en/function.uniqid.php ), the MongoDB drivers ( http://docs.mongodb.org/manual/core/object-id/ ), and even the plug-in you linked on GitHub ( https://github.com/dylang/shortid/blob/master/lib/shortid.js#L50 ) all use time as the basis for uniqueness.
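To illustrate the time-based idea, here is a minimal sketch of a generator that combines a timestamp, a per-process worker id, and a per-second counter, then encodes the result in base62. This is not the algorithm used by shortid, uniqid, or ObjectId; the class name, the epoch, and the limits are all assumptions chosen for illustration. It is not thread-safe, and uniqueness depends on every process being given a distinct worker id.

```python
import time

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
WORKERS = 62            # assumed limit: at most 62 distinct worker ids
MAX_PER_SECOND = 3844   # assumed limit: at most 62 * 62 ids per worker per second


def encode_base62(n: int) -> str:
    """Encode a non-negative integer using the 62-character alphabet."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


class ShortIdGenerator:
    """Hypothetical time-based id generator; no database query involved."""

    def __init__(self, worker_id: int, epoch: int = 1_356_998_400):
        # The default epoch is 2013-01-01 UTC, an arbitrary choice that keeps
        # the encoded timestamp small.
        if not 0 <= worker_id < WORKERS:
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.epoch = epoch
        self._last_second = -1
        self._counter = 0

    def generate(self, now=None) -> str:
        second = int(time.time() if now is None else now) - self.epoch
        # If the clock steps backwards, keep using the last second seen.
        second = max(second, self._last_second)
        if second == self._last_second:
            self._counter += 1
            if self._counter >= MAX_PER_SECOND:
                raise RuntimeError("too many ids requested in one second")
        else:
            self._last_second = second
            self._counter = 0
        # Pack the three parts into one integer so distinct
        # (second, worker, counter) triples never share an encoding.
        packed = (second * WORKERS + self.worker_id) * MAX_PER_SECOND + self._counter
        return encode_base62(packed)
```

The worker id plays the role of the host identifier: two processes that share one could emit the same id in the same second, which is why a unique index on the short-id field is still worth keeping.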

Considering that the plug-in you linked does not query the database to check its own IDs' uniqueness, I would say it is probably quite performant, and if you use it with the first solution you stated, you should get good benchmark results out of it.
