0
votes

Not quite sure what the best practice is if I have two collections, a user collection and a picture collection - I do not want to embed all my pictures into my user collection.

  1. My client searches for pictures matching certain criteria. Let's say he gets 50 pictures back from the search (i.e. one single MongoDB query). Each picture is associated with one user, and I want the user name displayed as well. I assume there is no performant way to fetch the name of each picture's user from the user collection in a single search, i.e. I would have to do 50 searches. Does that mean I can only avoid this extra load by duplicating data (the user_name next to the user_id) in my pictures collection?

  2. Same question the other way around. Say my client searches for users and 50 users are returned from one single query. If I want each user's last associated picture + title displayed next to the user data, would I again have to add that to the users collection? Otherwise I assume I need to do 50 queries to return the picture data.

2
That is correct. Without modifying the structure of your collections you basically have two options: store a reference to the user in your pictures collection and use it to look up the user associated with each picture, or duplicate the data. The documentation for references is located here: docs.mongodb.org/manual/reference/database-references. - user2263572

2 Answers

2
votes

Let's say the schema for your picture collection looks like this:

Picture Document

{
    _id: ObjectId(123),
    url: 'img1.jpg',
    title: 'img_one',
    userId: ObjectId(342)
}

1) Your picture query will return documents that look like the above. You don't have to make 50 calls to get the users associated with the images. You can simply make one more query to the Users Collection, using the user ids taken from the picture documents, like this:

db.users.find({_id: {$in: [userid_1, userid_2, userid_3, ..., userid_n]}})

You will get back the matching user documents, one per distinct user and in no guaranteed order, so you'll have to match them up with the pictures on the client before displaying them. At most you'll need 2 calls.
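The client-side matching step could look roughly like the sketch below. It is only an illustration: `collectUserIds`, `attachUserNames` and the `criteria` variable are hypothetical names, and it assumes the user documents carry a `name` field. Ids are compared as strings because two ObjectId values for the same id are distinct objects in JavaScript.

```javascript
// Collect the distinct user ids referenced by a page of picture documents.
// Keys are stringified so that equal ids are deduplicated; the original
// id values are kept so they can be passed straight to $in.
function collectUserIds(pictures) {
  const seen = new Map();
  for (const pic of pictures) {
    seen.set(String(pic.userId), pic.userId);
  }
  return [...seen.values()];
}

// Return a copy of each picture with a userName field taken from the
// matching user document, or null if no such user was returned.
function attachUserNames(pictures, users) {
  const nameById = new Map(users.map((u) => [String(u._id), u.name]));
  return pictures.map((pic) => {
    const key = String(pic.userId);
    return { ...pic, userName: nameById.has(key) ? nameById.get(key) : null };
  });
}

// Usage in mongosh (criteria is whatever filter the search uses):
// const pictures = db.pictures.find(criteria).limit(50).toArray();
// const users = db.users.find({ _id: { $in: collectUserIds(pictures) } }).toArray();
// const rows = attachUserNames(pictures, users);
```

Whatever the page size, this stays at two round trips: one for the pictures and one for all of their users.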

Alternatively

You could design the schema like this:

Picture Document

{
    _id: ObjectId(123),
    url: 'img1.jpg',
    title: 'img_one',
    userId: ObjectId(342),
    user_name: "user associated"
}

If you design it this way, you would only require 1 call, but the user name won't automatically stay in sync with the user collection documents. For example, let's say a user changes their name. A picture that was saved before may still have the old user name.
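One way to limit that drift is to push the new name into the pictures whenever a user is renamed. The following is a minimal sketch for mongosh, not a complete solution: `renameUser` is a hypothetical helper, `db` is assumed to be the shell's database handle, and the two writes are not atomic unless you wrap them in a transaction.

```javascript
// Rename a user and propagate the new name to the duplicated field
// on that user's pictures. Returns the result of the pictures update.
function renameUser(db, userId, newName) {
  // 1. Update the authoritative copy in the users collection.
  db.users.updateOne({ _id: userId }, { $set: { name: newName } });

  // 2. Update the denormalized copy on every picture of that user.
  return db.pictures.updateMany(
    { userId: userId },
    { $set: { user_name: newName } }
  );
}

// Usage in mongosh (someUserId is a placeholder):
// renameUser(db, someUserId, "New Name");
```

The cost moves from read time to write time, which is usually a good trade when names change far less often than pictures are displayed.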

2) You could design your User Collection as such:

User Document

{
    _id: ObjectId(342),
    name: "Steve jobs",
    last_assoc_img: {
        img_id: ObjectId(342),
        url: 'img_one',
        title: 'last image title'
    }
}

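The same trade-off applies as above: with the last picture embedded you need only 1 call, but the embedded copy must be updated whenever the user adds a picture or changes its title. Otherwise, keep only references and make a second query to the pictures collection using $in on the user ids.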
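If you go with the second query rather than embedding, an aggregation can return just the latest picture per user in one call. This is a sketch under assumptions: `lastPicturePipeline` is a hypothetical helper, and it treats a higher `_id` as "newer", which only holds approximately and only if the pictures use default, driver-generated ObjectIds. With an explicit creation-date field you would sort on that instead.

```javascript
// Build an aggregation pipeline that yields, for each of the given
// user ids, the url and title of that user's most recent picture.
function lastPicturePipeline(userIds) {
  return [
    // Only pictures belonging to the users on the current page.
    { $match: { userId: { $in: userIds } } },
    // Newest first (assumes default ObjectIds, see note above).
    { $sort: { _id: -1 } },
    // Keep the first, i.e. newest, picture seen for each user.
    {
      $group: {
        _id: "$userId",
        url: { $first: "$url" },
        title: { $first: "$title" },
      },
    },
  ];
}

// Usage in mongosh (userIds taken from the user search results):
// const latest = db.pictures.aggregate(lastPicturePipeline(userIds)).toArray();
```

Each result document has the user id as its `_id`, so it can be matched back to the user list on the client in the same way as the names in part 1.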

1
votes

Assuming that you have a user id associated with every user and you're also storing that id in the picture document, then your user <=> picture is a loosely coupled relationship.

To avoid making 50 separate calls, you can use the $in operator, provided you can pull those ids out of the first result and put them into a list for the second query. In plain English, the query says: "Look at the collection; if a document's id is in this list of ids, give it back to me."

If you intend to do this a lot and need it to scale, I'd recommend either a relational database or a NoSQL database that can handle joins, so that you are not forced into an embedded document schema.