1
votes

Let's say I have a match collection in the following format

     {user1: "a", user2: "b"},
        {user1: "a", user2: "c"},
        {user1: "b", user2: "d"},
        {user1: "b", user2: "c"},
        {user1: "b", user2: "e"},
        {user1: "c", user2: "f"}

I would like to know which user has the most appearance (either in user1 or user2). The result should be in this format ordered by the number of occurence.

{"user": "b", count:4},
            {"user": "c", count:3},
            {"user": "a", count:2},
            {"user": "d", count:1},
            {"user": "f", count:1},
            {"user": "e", count:1}

Is there a way I can group on the value of two fields?

Something like match.aggregate({$group: {_id: {$or:["user1","user2]}}, count:{$sum:1}})

3

3 Answers

7
votes
db.match.aggregate([
   {$project: { user: [ "$user1", "$user2" ]}},
   {$unwind: "$user"},
   {$group: {_id: "$user", count: {$sum:1}}}
])

First stage projects each document into array of users

    {user: ["a", "b"]},
    {user: ["a", "c"]},
    {user: ["b", "d"]},
    ...

Next we unwind arrays

    {user:"a"},
    {user:"b"},
    {user:"a"},
    {user:"c"},
    {user:"b"},
    ...

And simple grouping at the end

0
votes

Basically the concept is to $map onto an array and work from there:

db.collection.aggregate([
  { "$project": {
    "_id": 0,
    "user": { "$map": {
      "input": ["A","B"],
      "as": "el",
      "in": {
        "$cond": {
          "if": { "$eq": [ "$$el", "A" ] },
          "then": "$user1",
          "else": "$user2"
        }
      }
    }}
  }},
  { "$unwind": "$user" },
  { "$group": {
    "_id": "$user",
    "count": { "$sum": 1 }
  }}
])
0
votes

Let us take an example and go through

db.users_data.find();
{
    "_id" : 1,
    "user1" : "a",
    "user2" : "aa",
    "status" : "NEW",
    "createdDate" : ISODate("2016-05-03T08:52:32.434Z")
},
{
    "_id" : 2,
    "user1" : "a",
    "user2" : "ab",
    "status" : "NEW",
    "createdDate" : ISODate("2016-05-03T09:52:32.434Z")
},
{
    "_id" : 3,
    "user1" : "b",
    "user2" : "aa",
    "status" : "NEW",
    "createdDate" : ISODate("2016-05-03T10:52:32.434Z")
},
{
    "_id" : 4,
    "user1" : "b",
    "user2" : "ab",
    "status" : "NEW",
    "createdDate" : ISODate("2016-05-03T10:52:32.434Z")
},
{
    "_id" : 5,
    "user1" : "a",
    "user2" : "aa",
    "status" : "OLD",
    "createdDate" : ISODate("2015-05-03T08:52:32.434Z")
},
{
    "_id" : 6,
    "user1" : "a",
    "user2" : "ab",
    "status" : "OLD",
    "createdDate" : ISODate("2015-05-03T08:52:32.434Z")
},

Then

db.users_data.aggregate([
        {"$group" : {_id:{user1:"$user1",user2:"$user2"}, count:{$sum:1}}} ])
    ])

will give the resuls as

{ "_id" : { "user1" : "a", "user2" : "aa" }, "count" : 2}
{ "_id" : { "user1" : "a", "user2" : "ab" }, "count" : 2}
{ "_id" : { "user1" : "b", "user2" : "aa" }, "count" : 1}
{ "_id" : { "user1" : "b", "user2" : "ab" }, "count" : 1}

Thus grouping by multiple ids are possible Now one more variation

db.users_data.aggregate([
        {"$group" : {_id:{user1:"$user1",user2:"$user2",status:"$status"}, count:{$sum:1}}} ])
    ])

will give the resuls as

{ "_id" : { "user1" : "a", "user2" : "aa","status":"NEW" }, "count" : 1}
{ "_id" : { "user1" : "a", "user2" : "ab","status":"NEW" }, "count" : 1}
{ "_id" : { "user1" : "b", "user2" : "aa","status":"NEW" }, "count" : 1}
{ "_id" : { "user1" : "b", "user2" : "ab","status":"NEW" }, "count" : 1}
{ "_id" : { "user1" : "a", "user2" : "aa","status":"OLD" }, "count" : 1}
{ "_id" : { "user1" : "a", "user2" : "ab","status":"OLD" }, "count" : 1}