
I'm trying to build a specific query with MongoDB's aggregation framework. I think I need the $group and $addToSet operators, but I'm not sure how to combine them.

This is the article collection:

/* 0 */
{
    "_id" : 4,
    "author" : "Kevin Vanhove",
    "book" : {
        "order" : 500,
        "title" : "HTML",
        "url" : "html"
    },
    "chapter" : {
        "img" : "navChapter-logo",
        "order" : 500,
        "title" : "W3C",
        "url" : "w3c"
    },
    "featured" : 0,
    "heading" : [ 
        {
            "title" : "title1",
            "_id" : ObjectId("53130fb8b9b9f573a877401d")
        }, 
        {
            "title" : "title2",
            "_id" : ObjectId("53130fb8b9b9f573a877401c")
        }
    ],
    "intro" : "Some intro text",
    "title" : "Internet with and without the W3C",
    "url" : "internet-with-and-without-the-W3C"
}

/* 1 */
{
    "_id" : 1,
    "author" : "Kevin Vanhove",
    "book" : {
        "order" : 500,
        "title" : "Javascript",
        "url" : "javascript"
    },
    "chapter" : {
        "img" : "navChapter-logo",
        "order" : 500,
        "title" : "Functions",
        "url" : "functions"
    },
    "featured" : 1,
    "heading" : [ 
        {
            "title" : "Parts of a function",
            "_id" : ObjectId("53130e0cd8517b65a614c1ab")
        }, 
        {
            "title" : "Something else",
            "_id" : ObjectId("53130e0cd8517b65a614c1aa")
        }
    ],
    "intro" : "Some intro text",
    "title" : "A visual illustration of the JS function",
    "url" : "a-visual-illustration-of-the-javascript-function"
}

/* 2 */
{
    "_id" : 2,
    "author" : "Kevin Vanhove",
    "book" : {
        "order" : 500,
        "title" : "Javascript",
        "url" : "javascript"
    },
    "chapter" : {
        "img" : "navChapter-logo",
        "order" : 300,
        "title" : "Variables",
        "url" : "variables"
    },
    "featured" : 1,
    "heading" : [ 
        {
            "title" : "Global vs local",
            "_id" : ObjectId("53130ea8a9dc9c28a77ea28a")
        }, 
        {
            "title" : "Variable hoisting",
            "_id" : ObjectId("53130ea8a9dc9c28a77ea289")
        }, 
        {
            "title" : "The scope chain",
            "_id" : ObjectId("53130ea8a9dc9c28a77ea288")
        }
    ],
    "intro" : "Some intro text",
    "title" : "How variable scope works in javascript",
    "url" : "how-variable-scope-works-in-javascript"
}

/* 3 */
{
    "__v" : 0,
    "_id" : 3,
    "author" : "Kevin Vanhove",
    "book" : {
        "order" : 500,
        "title" : "Javascript",
        "url" : "javascript"
    },
    "chapter" : {
        "img" : "navChapter-logo",
        "order" : 600,
        "title" : "Functions",
        "url" : "functions"
    },
    "featured" : 0,
    "heading" : [ 
        {
            "title" : "title1",
            "_id" : ObjectId("53130f60f0de2506a81e2d62")
        }, 
        {
            "title" : "title2",
            "_id" : ObjectId("53130f60f0de2506a81e2d61")
        }
    ],
    "intro" : "Some intro text",
    "title" : "Javascript closures, in depth",
    "url" : "Javascript-closure-in-depth"
}

I need all the unique books and their unique chapters (no duplicates), so I use this query:

/*
db.articles.aggregate({$group : {_id : "$book.title", chapters:{$addToSet:"$chapter.title"}}})
*/

This gives me this result:

/* 0 */
{
    "result" : [ 
        {
            "_id" : "Javascript",
            "chapters" : [ 
                "Variables", 
                "Functions"
            ]
        }, 
        {
            "_id" : "HTML",
            "chapters" : [ 
                "W3C"
            ]
        }
    ],
    "ok" : 1
}

That is almost what I want, but not quite. What I actually need is this:

/* 0 */
{
    "result" : [ 
        {
            "_id" : "Javascript",
            "chapters" : [ 
                {
                    "img" : "navChapter-logo",
                    "order" : 600,
                    "title" : "Functions",
                    "url" : "functions"
                }, 
                {
                    "img" : "navChapter-logo",
                    "order" : 300,
                    "title" : "Variables",
                    "url" : "variables"
                }
            ]
        }, 
        {
            "_id" : "HTML",
            "chapters" : [ 
                {
                    "img" : "navChapter-logo",
                    "order" : 500,
                    "title" : "W3C",
                    "url" : "w3c"
                }
            ]
        }
    ],
    "ok" : 1
}

So I need all the unique books with their unique chapters, but I also want the extra chapter fields such as "order" and "url". The query I'm using at the moment only gives me the chapter titles.

Update:

I also tried $addToSet:"$chapter" instead of $addToSet:"$chapter.title".

But now I get duplicates on the chapter.title field. I should get only 2 distinct chapters in the book 'Javascript', and instead I get 3 chapters (1 duplicate).

$addToSet only accumulates unique values, so you get duplicates with $addToSet: "$chapter" because your "duplicate" chapters have different order fields in different documents (for example, the 'Functions' chapter). In other words, they're not actually duplicates. Do you want to exclude them anyway? - voithos
Yes, I don't need the "order" field. - kevinius
What value do you want for order? For the Javascript chapter Functions you have order 500 in one document and order 600 in another. What do you want in the result? - Asya Kamsky
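
To illustrate the point made in the comments, here is a small plain-JavaScript sketch (not MongoDB code). It mimics how deduplicating on the whole chapter sub-document differs from deduplicating on the title alone. The helper name addToSetBy is made up for this example, and using JSON.stringify as the comparison key is only an approximation of how MongoDB compares sub-documents.

```javascript
// Chapter sub-documents of the three "Javascript" articles from the question.
const chapters = [
  { img: "navChapter-logo", order: 500, title: "Functions", url: "functions" },
  { img: "navChapter-logo", order: 300, title: "Variables", url: "variables" },
  { img: "navChapter-logo", order: 600, title: "Functions", url: "functions" },
];

// Hypothetical helper (not a MongoDB API): keeps the first item seen
// for each distinct key, roughly what a set accumulator does.
function addToSetBy(items, keyOf) {
  const seen = new Map();
  for (const item of items) {
    const key = keyOf(item);
    if (!seen.has(key)) seen.set(key, item);
  }
  return [...seen.values()];
}

// Whole sub-document as the key: the two "Functions" entries differ in
// `order`, so both survive.
const byWholeDoc = addToSetBy(chapters, (c) => JSON.stringify(c));

// Title only as the key: the two "Functions" entries collapse into one.
const byTitle = addToSetBy(chapters, (c) => c.title);

console.log(byWholeDoc.length, byTitle.length); // 3 2
```

So any fix has to decide which fields define "the same chapter" and what to do with the fields that differ.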

1 Answer


You can use $group to aggregate by book and chapter.

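db.articles.aggregate([
    // One output document per distinct (book title, chapter title) pair
    {$group : {
        _id : {t : "$book.title", c : "$chapter.title"},
        img : {$first : "$chapter.img"},
        url : {$first : "$chapter.url"},
        // order differs between "duplicate" chapters, so it is summed
        order : {$sum : "$chapter.order"}
    }}
])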

In case order is important, I kept it and summed it across the "duplicate" chapters of the same book.
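
The grouping above returns one document per book/chapter pair. To get the nested shape from the question (one document per book with an array of chapter objects), a second $group stage could regroup by book. The following is a sketch under assumptions, not a tested solution: the _id field names book and chapter are my own choice, and order is left out because the asker said in the comments that it isn't needed.

```javascript
// Two-stage pipeline sketch. In the mongo shell it would be run as:
//   db.articles.aggregate(pipeline)
const pipeline = [
  // Stage 1: one document per distinct (book title, chapter title) pair,
  // taking img and url from the first article seen for that pair.
  {
    $group: {
      _id: { book: "$book.title", chapter: "$chapter.title" },
      img: { $first: "$chapter.img" },
      url: { $first: "$chapter.url" },
    },
  },
  // Stage 2: regroup by book and collect the chapter objects into an array.
  // Each (book, chapter) pair is already unique after stage 1, so $push
  // does not introduce duplicates.
  {
    $group: {
      _id: "$_id.book",
      chapters: {
        $push: { title: "$_id.chapter", img: "$img", url: "$url" },
      },
    },
  },
];
```

If order is needed after all, it could be carried through stage 1 with an accumulator such as $first, $max or $sum, depending on which value should win when the same chapter appears with different order values.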