5
votes

I'm pretty new to CouchDB and I still have some problems wrapping my head around the whole MapReduce way of querying my data...

To stay with the traditional "Blog" example, let's say I have 2 types of documents: post and comment... each comment document has a post_id field...

Is there a way I can get a list of posts with the number of comments for each of these posts with only 1 query? Let's say I want to display a list of post titles with the number of comments for each post like this:

My First Post: 4 comments
My Second Post: 6 comments
....

I know I can do the following:

function(doc) {
    if(doc.type == "comment") {
        emit(doc.post_id, 1);
    }
}

and then reduce it like this:

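function (key, values, rereduce) {
    // sum() works for both the reduce and the rereduce pass;
    // CouchDB's built-in "_sum" reduce gives the same result.
    return sum(values);
}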

which gives me a list of blog post ids, each with its number of comments. But then I need to fetch the blog post titles separately, since the only thing I have at that point is their ids...

So, is there a way I could retrieve a list of blog post titles, with the number of comments for each post, by doing only 1 query?

3 Answers

2
votes

You could do something like this:

function(doc) {
    if(doc.type == "post") {
        emit([doc._id, 'title', doc.title], 0);
    }
    if(doc.type == "comment") {
        emit([doc.post_id, 'comments'], 1);
    }
}

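With a sum reduce like the one in the question, queried with group=true, you'd then get a view where each post gets up to two rows: one whose key carries the title and one whose value is the comment count. A post without comments gets only the title row.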
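To make the row shape concrete, here is a minimal sketch that runs the map function above over a few made-up documents in plain JavaScript, outside CouchDB. The `emit` stub, the sample documents, and the `groupedRows` helper are all hypothetical. The helper only approximates what a sum reduce queried with `group=true` would return, and it uses a simplified key ordering rather than CouchDB's real collation.

```javascript
// Hypothetical sample documents (not from the original post).
var docs = [
  { _id: "p1", type: "post", title: "My First Post" },
  { _id: "p2", type: "post", title: "My Second Post" },
  { _id: "c1", type: "comment", post_id: "p1" },
  { _id: "c2", type: "comment", post_id: "p1" },
  { _id: "c3", type: "comment", post_id: "p2" }
];

// Stand-in for CouchDB's emit(): collects rows in memory.
var emitted = [];
function emit(key, value) {
  emitted.push({ key: key, value: value });
}

// The map function from the answer.
function map(doc) {
  if (doc.type == "post") {
    emit([doc._id, "title", doc.title], 0);
  }
  if (doc.type == "comment") {
    emit([doc.post_id, "comments"], 1);
  }
}

// Rough approximation of a sum reduce queried with group=true:
// rows with identical keys are merged and their values added up.
function groupedRows(rows) {
  var totals = {};
  rows.forEach(function (row) {
    var k = JSON.stringify(row.key);
    totals[k] = (totals[k] || 0) + row.value;
  });
  // Simplified ordering (string comparison of the JSON-encoded key),
  // not CouchDB's actual collation rules.
  return Object.keys(totals).sort().map(function (k) {
    return { key: JSON.parse(k), value: totals[k] };
  });
}

docs.forEach(map);
var rows = groupedRows(emitted);
console.log(JSON.stringify(rows, null, 2));
```

In this sketch the "comments" row of a post sorts before its "title" row, because "comments" comes first alphabetically, so whatever merges the rows should not rely on the title arriving first.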

You can merge the rows together on the client, or you can use a "list" function to merge these groups of rows together within CouchDB:

http://wiki.apache.org/couchdb/Formatting_with_Show_and_List

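function list(head, req) {
  var post;
  var row;
  // Send the post collected so far, if any, as one line of JSON.
  // send() expects a string, so the object has to be serialized first.
  var outputRow = function() {
     if(post) { send(JSON.stringify(post) + "\n"); }
  };
  while(row = getRow()) {
    // A new post id in the key means the previous post is complete.
    if(!post || row.key[0] != post.id) {
      outputRow();
      post = {id:row.key[0]};
    }
    /* If key is a triple, use part 3 as the value, otherwise assume it's a count */
    var value = row.key.length === 3 ? row.key[2] : row.value;
    post[row.key[1]] = value;
  }
  outputRow();
}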

Note: not tested code!
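For the client-side option, the merge might look like the following sketch. It assumes the view is queried with a sum reduce and `group=true`, so that the rows have the shape of the hard-coded sample below. The `mergeRows` name and the sample rows are hypothetical.

```javascript
// Merge grouped view rows into one object per post.
// Assumes rows arrive sorted by key, so the rows of one post are adjacent.
function mergeRows(rows) {
  var posts = [];
  var post = null;
  rows.forEach(function (row) {
    var postId = row.key[0];
    if (!post || post.id !== postId) {
      // Default to 0 so posts without comments still get a count.
      post = { id: postId, title: null, comments: 0 };
      posts.push(post);
    }
    if (row.key[1] === "title") {
      post.title = row.key[2];
    } else if (row.key[1] === "comments") {
      post.comments = row.value;
    }
  });
  return posts;
}

// Hypothetical rows, shaped like the "rows" array of a grouped view response.
var sampleRows = [
  { key: ["p1", "comments"], value: 4 },
  { key: ["p1", "title", "My First Post"], value: 0 },
  { key: ["p2", "comments"], value: 6 },
  { key: ["p2", "title", "My Second Post"], value: 0 },
  { key: ["p3", "title", "No Comments Yet"], value: 0 }
];

var merged = mergeRows(sampleRows);
merged.forEach(function (p) {
  console.log(p.title + ": " + p.comments + " comments");
});
```

Unlike the list function, this version fills in a count of 0 for posts that have a title row but no comments row.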

1
vote

My experience is that in most "normal" cases you are better off having one big document containing both the post and the comments.

Of course, I am aware that it's not a good idea if you have thousands of comments. That's why I said "most normal cases". Don't dismiss this option out of hand as "improper".

You get all kinds of goodies, like being able to count the comments in the map function, easy (one request) retrieval of the whole page from the database, ACID per post (with comments), etc. Plus, you don't need to think about tricks like view collation right now.
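As a rough sketch of the comment-counting point: assuming each post document carries its comments in an array field (called `comments` here, which is a hypothetical name, as are the sample document and the `emit` stub), a single map function can emit the title together with the count, and no reduce is needed.

```javascript
// Map function for a design where comments are embedded in the post.
// The "comments" array field is an assumed document layout.
function mapPostWithCount(doc) {
  if (doc.type == "post") {
    var count = Array.isArray(doc.comments) ? doc.comments.length : 0;
    emit(doc._id, { title: doc.title, comments: count });
  }
}

// Stand-in for CouchDB's emit(), only so the sketch runs outside CouchDB.
var viewRows = [];
function emit(key, value) {
  viewRows.push({ key: key, value: value });
}

// Hypothetical post document with embedded comments.
var examplePost = {
  _id: "p1",
  type: "post",
  title: "My First Post",
  comments: [
    { author: "alice", text: "Nice post" },
    { author: "bob", text: "Thanks for sharing" }
  ]
};

mapPostWithCount(examplePost);
console.log(viewRows[0].value.title + ": " + viewRows[0].value.comments + " comments");
```

Inside CouchDB only the body of `mapPostWithCount` would go into the design document, as an anonymous function like the other map functions in this thread.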

If it gets slow, you can always transform your data structure later on (hell, we used to do it every day with RDBMS).

If your use case is not totally unsuitable for this, I really advise you to try it. It works remarkably well.