I have "ratings" documents that record the rating a user gave to a certain item, like this:

accountID: a1, itemID: i1, rating: 1
accountID: a2, itemID: i1, rating: 1
accountID: a3, itemID: i1, rating: 1
accountID: a1, itemID: i2, rating: 1
accountID: a2, itemID: i3, rating: 1

I would like to create a view that shows which users rated the same item, and how many times. Following the above data, we see that a1, a2 and a3 rated the same item i1, while i2 and i3 were only rated individually. The resulting set should then look like:

accountID1: a1, accountID2: a2, numMatches: 1
accountID1: a1, accountID2: a3, numMatches: 1
accountID1: a2, accountID2: a3, numMatches: 1

This shows that a1 and a2 both rated the same item once (i1), as did a1 and a3, and a2 and a3 (all for i1). The other items were ignored since only one user rated them.

Is it possible to achieve this transformation with map/reduce in CouchDB/Cloudant? Or do I have to do the calculation client-side, by pulling all ratings for a given item and iterating over every account?

1 Answer

One way you could solve this is to index on [itemID, accountID], emit the rating as the value, and then use the built-in _stats reduce to get whatever summary you want. Here's your design document:

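{
  _id: "_design/count_shared_ratings",
  views: {
    "count_shared_ratings": {
      // Key: [itemID, accountID]; value: the rating itself.
      map: function(doc) {
        emit([doc.itemID, doc.accountID], doc.rating);
      }.toString(),
      // Built-in reduce: returns sum, count, min, max and sumsqr per group.
      reduce: "_stats"
    }
  }
}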

You can then query, for example, http://localhost:5984/testdb1/_design/count_shared_ratings/_view/count_shared_ratings?reduce=true&group=true&group_level=1, which groups by the first element of the key (i.e. the item ID), giving you:

{"rows":[
  {"key":["i1"],"value":{"sum":3,"count":3,"min":1,"max":1,"sumsqr":3}},
  {"key":["i2"],"value":{"sum":1,"count":1,"min":1,"max":1,"sumsqr":1}},
  {"key":["i3"],"value":{"sum":1,"count":1,"min":1,"max":1,"sumsqr":1}}
]}
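
The "count" field in those rows already tells you which items were rated more than once, so the single-rating items can be dropped before any further work. A minimal sketch of that filter, assuming each account rates an item at most once (so count equals the number of accounts) and that the rows are shaped like the response above; the names itemRows and sharedItems are made up for illustration:

```javascript
// Rows as returned by the group_level=1 query above.
const itemRows = [
  { key: ["i1"], value: { sum: 3, count: 3, min: 1, max: 1, sumsqr: 3 } },
  { key: ["i2"], value: { sum: 1, count: 1, min: 1, max: 1, sumsqr: 1 } },
  { key: ["i3"], value: { sum: 1, count: 1, min: 1, max: 1, sumsqr: 1 } }
];

// Hypothetical helper: keep only the IDs of items that have at least two ratings.
// Assumes one rating document per account per item.
function sharedItems(rows) {
  return rows
    .filter((row) => row.value.count >= 2)
    .map((row) => row.key[0]);
}

console.log(sharedItems(itemRows));
```

With the sample data only i1 survives the filter, since i2 and i3 each have a single rating.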

You can also group by the full key, in which case you get per-item, per-user summaries (http://localhost:5984/testdb1/_design/count_shared_ratings/_view/count_shared_ratings?reduce=true&group=true):

{"rows":[
  {"key":["i1","a1"],"value":{"sum":1,"count":1,"min":1,"max":1,"sumsqr":1}},
  {"key":["i1","a2"],"value":{"sum":1,"count":1,"min":1,"max":1,"sumsqr":1}},
  {"key":["i1","a3"],"value":{"sum":1,"count":1,"min":1,"max":1,"sumsqr":1}},
  {"key":["i2","a1"],"value":{"sum":1,"count":1,"min":1,"max":1,"sumsqr":1}},
  {"key":["i3","a2"],"value":{"sum":1,"count":1,"min":1,"max":1,"sumsqr":1}}
]}
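
Note that these rows are per item and per account, so one more step is still needed to get the accountID1 / accountID2 / numMatches pairs from the question. A rough client-side sketch of that pairing step, assuming the rows have the shape shown above; countSharedRatings and pairRows are hypothetical names, not part of CouchDB:

```javascript
// Rows as returned by the group=true query above (values omitted, only keys are used).
const pairRows = [
  { key: ["i1", "a1"] },
  { key: ["i1", "a2"] },
  { key: ["i1", "a3"] },
  { key: ["i2", "a1"] },
  { key: ["i3", "a2"] }
];

// Hypothetical helper: for every pair of accounts, count the items both rated.
function countSharedRatings(rows) {
  // Collect the set of accounts that rated each item.
  const accountsByItem = new Map();
  for (const { key: [itemID, accountID] } of rows) {
    if (!accountsByItem.has(itemID)) accountsByItem.set(itemID, new Set());
    accountsByItem.get(itemID).add(accountID);
  }

  // Count each unordered pair once per shared item.
  const matches = new Map();
  for (const accounts of accountsByItem.values()) {
    const sorted = [...accounts].sort();
    for (let i = 0; i < sorted.length; i++) {
      for (let j = i + 1; j < sorted.length; j++) {
        const pair = JSON.stringify([sorted[i], sorted[j]]);
        matches.set(pair, (matches.get(pair) || 0) + 1);
      }
    }
  }

  // Convert the counts into the row shape asked for in the question.
  return [...matches].map(([pair, numMatches]) => {
    const [accountID1, accountID2] = JSON.parse(pair);
    return { accountID1, accountID2, numMatches };
  });
}

console.log(countSharedRatings(pairRows));
```

Items with a single rating produce no pairs, so i2 and i3 drop out on their own. The inner loops are quadratic in the number of accounts per item, which is worth keeping in mind for very popular items.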

Does that make sense?