I have a collection that stores user activity data: whether users upvote or downvote keywords from a select list as they relate to a catalogue of titles. I need an aggregation query in MongoDB that gives me the cumulative score of each keyword tied to a title across all users. An upvote is represented by vote: true and a downvote by vote: false.

I used the $unwind operator as my first step, followed by a $group, but this is where I'm lost: I'm not sure whether to group by a combination of titleId, keyword and vote; by just titleId and keyword; or by just titleId. Here is a sample of what the data looks like in the collection:

{ 
    "_id" : ObjectId("5d2eb0b20e91d1941540e2b5"), 
    "titleId" : "111222", 
    "userId" : "993f1e69-c6b0-44eb-a8a6-5db5dc42d425", 
    "activity" : [
        {
            "keyword" : "Road Trip", 
            "vote" : true
        }, 
        {
            "keyword" : "Friendship", 
            "vote" : true
        },
        {
            "keyword" : "Family Movie", 
            "vote" : true
        }
    ]
}
// ----------------------------------------------
{ 
    "_id" : ObjectId("5d38fd270e91d1941559bee0"), 
    "titleId" : "111222", 
    "userId" : "f92c7118-8930-4d30-8e07-8363737a6866", 
    "activity" : [
        {
            "keyword" : "Road Trip", 
            "vote" : false
        }, 
        {
            "keyword" : "Adventure", 
            "vote" : true
        }, 
        {
            "keyword" : "Family Movie", 
            "vote" : true
        }
    ]
}
// ----------------------------------------------
{ 
    "_id" : ObjectId("5d38fd270e91d1941559bee0"), 
    "titleId" : "111222", 
    "userId" : "f92c7118-8930-4d30-8e07-8363737a6866", 
    "activity" : [
        {
            "keyword" : "Road Trip", 
            "vote" : false
        }, 
        {
            "keyword" : "Adventure", 
            "vote" : false
        }, 
        {
            "keyword" : "Family Movie", 
            "vote" : false
        }
    ]
}
// ----------------------------------------------
{ 
    "_id" : ObjectId("5d2eb0b20e91d1941540e2b5"), 
    "titleId" : "444222", 
    "userId" : "993f1e69-c6b0-44eb-a8a6-5db5dc42d425", 
    "activity" : [
        {
            "keyword" : "Educational film and videos", 
            "vote" : true
        }, 
        {
            "keyword" : "Space", 
            "vote" : true
        }
    ]
}
// ----------------------------------------------
{ 
    "_id" : ObjectId("5d38fd270e91d1941559bee0"), 
    "titleId" : "444222", 
    "userId" : "f92c7118-8930-4d80-8e07-8363737a6866", 
    "activity" : [
        {
            "keyword" : "Action", 
            "vote" : false
        }, 
        {
            "keyword" : "Adventure", 
            "vote" : false
        }, 
        {
            "keyword" : "Sci Fi", 
            "vote" : true
        }
    ]
}

I expect to see one document per titleId, each with an array of objects, where each object contains a keyword and the cumulative vote total for that keyword in the context of that title.

1 Answer

The following query produces the expected output:

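db.collection.aggregate([
    // Stage I: one document per element of the activity array
    {
        $unwind:"$activity"
    },
    // Stage II: net score per (titleId, keyword) pair; +1 for an upvote, -1 for a downvote
    {
        $group:{
            "_id":{
                "titleId":"$titleId",
                "keyword":"$activity.keyword"
            },
            "votes":{
                $sum:{
                    $cond:[
                        {
                            $eq:["$activity.vote",true]
                        },
                        1,
                        -1
                    ]
                }
            }
        }
    },
    // Stage III: one document per titleId, collecting the keyword totals into an array
    {
        $group:{
            "_id":"$_id.titleId",
            "titleId":{
                $first:"$_id.titleId"
            },
            "activity":{
                $push:{
                    "keyword":"$_id.keyword",
                    "votes":"$votes"
                }
            }
        }
    },
    // Stage IV: drop _id, which duplicates titleId
    {
        $project:{
            "_id":0
        }
    }
]).pretty()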

Output:

{
    "titleId" : "111222",
    "activity" : [
        {
            "keyword" : "Road Trip",
            "votes" : -1
        },
        {
            "keyword" : "Adventure",
            "votes" : 0
        },
        {
            "keyword" : "Friendship",
            "votes" : 1
        },
        {
            "keyword" : "Family Movie",
            "votes" : 1
        }
    ]
}
{
    "titleId" : "444222",
    "activity" : [
        {
            "keyword" : "Adventure",
            "votes" : -1
        },
        {
            "keyword" : "Sci Fi",
            "votes" : 1
        },
        {
            "keyword" : "Action",
            "votes" : -1
        },
        {
            "keyword" : "Educational film and videos",
            "votes" : 1
        },
        {
            "keyword" : "Space",
            "votes" : 1
        }
    ]
}
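
MongoDB does not guarantee the order of the documents that $group emits, so the order of the titles, and of the keywords inside each activity array, may differ between runs. One possible variant, sketched here as a plain pipeline array, sorts before the second $group and again at the end. The name sortedPipeline is only an illustration, and the assumption that $push keeps the incoming order after a $sort is worth verifying on your own deployment.

```javascript
// Hypothetical variant of the pipeline above with explicit ordering.
// Assumption: $push appends elements in the order documents reach $group.
const sortedPipeline = [
    { $unwind: "$activity" },
    {
        $group: {
            _id: { titleId: "$titleId", keyword: "$activity.keyword" },
            votes: {
                $sum: { $cond: [{ $eq: ["$activity.vote", true] }, 1, -1] }
            }
        }
    },
    // Order the per-keyword totals so each activity array is built alphabetically.
    { $sort: { "_id.titleId": 1, "_id.keyword": 1 } },
    {
        $group: {
            _id: "$_id.titleId",
            titleId: { $first: "$_id.titleId" },
            activity: { $push: { keyword: "$_id.keyword", votes: "$votes" } }
        }
    },
    { $project: { _id: 0 } },
    // $group output order is not guaranteed, so order the titles explicitly.
    { $sort: { titleId: 1 } }
];

// In the mongo shell: db.collection.aggregate(sortedPipeline).pretty()
```

The array is ordinary JavaScript, so it can be built and inspected outside the shell and then passed to aggregate() unchanged.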

Aggregation stages:

  • Stage I: Unwind the 'activity' array, so that there is one document per array element.
  • Stage II: Group by title ID and activity keyword, and sum the votes, counting 1 for an upvote and -1 for a downvote.
  • Stage III: Group by title ID only, and collect the keyword totals for each title into an array.
  • Stage IV: Drop the _id field from the output, since titleId already carries the same value.
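
To make the stages concrete, here is a rough in-memory sketch of the same steps in plain JavaScript, run against the sample documents from the question (trimmed to titleId and activity). The helper name scoreKeywords is hypothetical, and the code only imitates what the pipeline does; it is not how MongoDB executes it.

```javascript
// Sample documents from the question, trimmed to the fields the pipeline reads.
const docs = [
    { titleId: "111222", activity: [
        { keyword: "Road Trip", vote: true },
        { keyword: "Friendship", vote: true },
        { keyword: "Family Movie", vote: true } ] },
    { titleId: "111222", activity: [
        { keyword: "Road Trip", vote: false },
        { keyword: "Adventure", vote: true },
        { keyword: "Family Movie", vote: true } ] },
    { titleId: "111222", activity: [
        { keyword: "Road Trip", vote: false },
        { keyword: "Adventure", vote: false },
        { keyword: "Family Movie", vote: false } ] },
    { titleId: "444222", activity: [
        { keyword: "Educational film and videos", vote: true },
        { keyword: "Space", vote: true } ] },
    { titleId: "444222", activity: [
        { keyword: "Action", vote: false },
        { keyword: "Adventure", vote: false },
        { keyword: "Sci Fi", vote: true } ] }
];

// Hypothetical helper that mimics the four stages in memory.
function scoreKeywords(documents) {
    // Stage I: one record per element of the activity array.
    const unwound = documents.flatMap(d =>
        d.activity.map(a => ({ titleId: d.titleId, keyword: a.keyword, vote: a.vote })));

    // Stage II: sum +1 / -1 per (titleId, keyword) pair.
    const totals = new Map();
    for (const { titleId, keyword, vote } of unwound) {
        const key = JSON.stringify([titleId, keyword]);
        const entry = totals.get(key) || { titleId, keyword, votes: 0 };
        entry.votes += vote === true ? 1 : -1;
        totals.set(key, entry);
    }

    // Stages III and IV: collect the totals per title, keeping only titleId and activity.
    const byTitle = new Map();
    for (const { titleId, keyword, votes } of totals.values()) {
        if (!byTitle.has(titleId)) byTitle.set(titleId, { titleId, activity: [] });
        byTitle.get(titleId).activity.push({ keyword, votes });
    }
    return [...byTitle.values()];
}

const scores = scoreKeywords(docs);
console.log(JSON.stringify(scores, null, 2));
```

The totals agree with the output shown above (for title 111222: Road Trip -1, Adventure 0, Friendship 1, Family Movie 1). Only the order of the array elements differs, because a Map iterates in insertion order.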