My question arises from the following page in the MongoDB documentation: Model One-to-Many Relationships with Document References
The page presents two options to model one-to-many relationships with document references.
Option #1:
Store an array of references in the "one" document. For example, the books
field in the "publisher" document:
{
_id: "oreilly",
name: "O'Reilly Media",
founded: 1980,
location: "CA",
books: [123456789, 234567890, ...]
}
{
_id: 123456789,
title: "MongoDB: The Definitive Guide",
author: [ "Kristina Chodorow", "Mike Dirolf" ],
published_date: ISODate("2010-09-24"),
pages: 216,
language: "English"
}
{
_id: 234567890,
title: "50 Tips and Tricks for MongoDB Developer",
author: "Kristina Chodorow",
published_date: ISODate("2011-05-06"),
pages: 68,
language: "English"
}
Option #2:
Store a single reference in the "many" documents. For example, the publisher_id
field in the "book" documents:
{
_id: "oreilly",
name: "O'Reilly Media",
founded: 1980,
location: "CA"
}
{
_id: 123456789,
title: "MongoDB: The Definitive Guide",
author: [ "Kristina Chodorow", "Mike Dirolf" ],
published_date: ISODate("2010-09-24"),
pages: 216,
language: "English",
publisher_id: "oreilly"
}
{
_id: 234567890,
title: "50 Tips and Tricks for MongoDB Developer",
author: "Kristina Chodorow",
published_date: ISODate("2011-05-06"),
pages: 68,
language: "English",
publisher_id: "oreilly"
}
As the MongoDB documentation points out, one advantage of option #2 is to avoid mutable, growing arrays. That makes a ton of sense, but I'd like to know if there are other advantages/disadvantages to either options generally.
But in particular, I'm curious to know if the "get all books for a publisher" query would be faster when using one of these two options (assuming you already retrieved the "publisher" document).
With option #1, the query would look like:
db.books.find( { _id : { $in : [123456789, 234567890, ...] } } );
With option #2, the query would look like:
db.books.find( { publisher_id : "oreilly" } );
The query for option #2 looks more simple, but is it faster?
[EDIT]: Ultimately, I want to know if there is a single advantage to option #1?