1
votes

I’m trying to "avoid updates" in CouchDB, which seems to be the recommendation, but I'm having trouble writing a "reduce" view that returns only the latest value. In the data set below, the document a8f2298e5961b0ebf60e56022d253d2b (where s=FooQuux) should basically never be returned, and could be deleted as a "cleanup" operation without impact.

I'd like to be able to do operations such as:

  • Retrieve the value for a ==> Bar
    • this technique could work, using the s view, searching for [["a"], ["a",{}]] with limit=1.
  • Retrieve all values ==> a=Bar, b=Quux
    • Can I write a reduce to "strip out" documents that have an older timestamp?

For the next two operations, I'm not sure whether a single map/reduce view can do both, given the constraints:

  • Get the string length for each user, with group_level=1 ==> a=3, b=4 (not counting FooQuux)
  • Get the string length for all users, with no grouping ==> 7 (not counting FooQuux)

Data: (this is a contrived example)

Given data: (where t is a timestamp, u is a username, s is a string)

  • at timestamp 0, user a set value FooQuux
  • at timestamp 1, user a set value Bar
  • at timestamp 2, user b set value Quux

Database:

{
  "_id": "_design/all",
  "views": {
    "s": {
      "map": "function(doc) { if(doc.u) { emit([doc.u, doc.s, doc.t], doc.s); } }"
    },
    "slen": {
      "map": "function(doc) { if(doc.u) { emit([doc.u, doc.s, doc.t], doc.s.length); } }",
      "reduce": "_sum"
    }
  },
  "language": "javascript"
}

{
  "_id": "a8f2298e5961b0ebf60e56022d251ebd",
  "t": 2,
  "u": "b",
  "s": "Quux"
}

{
  "_id": "a8f2298e5961b0ebf60e56022d253a1b",
  "t": 1,
  "u": "a",
  "s": "Bar"
}

{
  "_id": "a8f2298e5961b0ebf60e56022d253d2b",
  "t": 0,
  "u": "a",
  "s": "FooQuux"
}
2
OK, so I haven't tried this yet… but I just had the idea that I could make a map like above ([u, s, t]) but with the document something like: { a: { t: 1, s: "Bar" } } … and then the reduce could collect the keys together, throwing out the old ones. So group=None would be { a: { s:Bar, t:1 }, b: {s:Quux, t:2} } and other group levels should work as expected… But for fetching a single "u", the limit=1 approach above would be best, as it wouldn't require a reduce. amirite? - Steven R. Loomis

2 Answers

3
votes

By creating a Map function like this:

function(doc) { emit([doc.u, doc.t], doc.s); }

You can use the view in two ways:

  • to aggregate the data - get counts of the number of documents for values of u (with the _count reducer and ?group_level=1)
  • to select the newest record for a value of u - (reduce=false&endkey=["a"]&startkey=["az"]&descending=true&limit=1)
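The "newest record" query can be sketched as a small URL builder. This is only an illustration: the server address, the database name `mydb`, and the view name `latest` are hypothetical, and the upper bound uses `[user, {}]` (an object sorts after numbers and strings in CouchDB's view collation) rather than the `["az"]` shorthand. Key parameters have to be JSON-encoded.

```javascript
// Sketch only: "mydb" and the view name "latest" are assumed names,
// not something defined in the question.
function newestForUserUrl(base, user) {
  const params = new URLSearchParams({
    reduce: "false",
    descending: "true",
    limit: "1",
    // With descending=true the scan runs from high keys to low keys,
    // so startkey is the upper bound and endkey is the lower bound.
    startkey: JSON.stringify([user, {}]),
    endkey: JSON.stringify([user]),
  });
  return `${base}/_design/all/_view/latest?${params}`;
}

const url = newestForUserUrl("http://localhost:5984/mydb", "a");
console.log(url);
```

The resulting URL can be fetched with any HTTP client; the first (and only) row is the entry with the highest t for that user.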

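As the documents are processed separately during indexing, the map function cannot contain logic that depends on the contents of another document. That is, you can keep a document out of the index based on its own fields, e.g. if (doc.live) { emit(doc.u, null); }, but you can't do "cross-document" logic such as "skip this document because a newer one exists". Picking the newest record has to happen at query time (or in a reduce).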
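To make the point concrete, here is a plain-JavaScript simulation of what happens to the question's three documents. It does not talk to CouchDB: `emit` is passed in as a parameter instead of being a global, and the sort is a simplified stand-in for CouchDB's key collation that is only adequate for these string/number keys.

```javascript
// Local simulation only; the documents are the ones from the question.
const docs = [
  { _id: "a8f2298e5961b0ebf60e56022d251ebd", t: 2, u: "b", s: "Quux" },
  { _id: "a8f2298e5961b0ebf60e56022d253a1b", t: 1, u: "a", s: "Bar" },
  { _id: "a8f2298e5961b0ebf60e56022d253d2b", t: 0, u: "a", s: "FooQuux" },
];

// The map function sees exactly one document and nothing else.
function map(doc, emit) {
  if (doc.u) {
    emit([doc.u, doc.t], doc.s);
  }
}

// Build the "index": one row per emit call.
const rows = [];
for (const doc of docs) {
  map(doc, (key, value) => rows.push({ id: doc._id, key, value }));
}

// Simplified ordering: by user, then by timestamp.
rows.sort((x, y) =>
  x.key[0] < y.key[0] ? -1 : x.key[0] > y.key[0] ? 1 : x.key[1] - y.key[1]
);

// Query-time selection, equivalent in spirit to descending=true&limit=1
// restricted to one user's key range.
function newestFor(user) {
  const mine = rows.filter((r) => r.key[0] === user);
  return mine.length ? mine[mine.length - 1] : null;
}

console.log(newestFor("a").value); // Bar
console.log(newestFor("b").value); // Quux
```

The old FooQuux row is still in the index; it is simply never the last row in user a's range.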

0
votes

Not trying to pull a LEGO, but here’s a possible solution:

Map

per @Glynn-Bird in https://stackoverflow.com/a/39128241/185799

function(doc) {
  emit([doc.u, doc.t], doc.s);
}

Reduce

function(keys, values, rereduce){
  // Return whichever entry is newer: the one already stored for u, or o.
  function rollin(map, u, o) {
    if((!map[u]) || (o.t > map[u].t)) {
        return o;
    } else {
        return map[u];
    }
  }
  var m = {}; // user -> { s: latest string, t: its timestamp }
  for(var i=0;i<values.length;i++) {
      var v = values[i];
      if(rereduce) {
          // merge in this chunk (v is the output of an earlier reduce call)
          for(var u in v) {
            m[u] = rollin(m, u, v[u]);
          }
      } else {
          // roll in single entries; keys[i] is [emittedKey, docId]
          var key = keys[i][0]; // [doc.u, doc.t]
          m[key[0]] = rollin(m, key[0], {
              s: v,
              t: key[1]
          });
      }
  }
  return m;
}
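The reduce logic can be checked outside CouchDB with a small harness. The sketch below repeats the function under a hypothetical name, `latestPerUser`, so it can be called directly. How the rows are split into two chunks is made up for the example, since CouchDB decides the real chunking itself. In the first pass `keys` holds `[emittedKey, docId]` pairs; in the rereduce pass `keys` is null and `values` holds earlier reduce results.

```javascript
// Same logic as the reduce function above, under an assumed name.
function latestPerUser(keys, values, rereduce) {
  function rollin(map, u, o) {
    return (!map[u] || o.t > map[u].t) ? o : map[u];
  }
  var m = {};
  for (var i = 0; i < values.length; i++) {
    var v = values[i];
    if (rereduce) {
      for (var u in v) {
        m[u] = rollin(m, u, v[u]);
      }
    } else {
      var key = keys[i][0]; // [doc.u, doc.t]
      m[key[0]] = rollin(m, key[0], { s: v, t: key[1] });
    }
  }
  return m;
}

// First pass over two hypothetical chunks of map rows.
var chunk1 = latestPerUser(
  [
    [["a", 0], "a8f2298e5961b0ebf60e56022d253d2b"],
    [["b", 2], "a8f2298e5961b0ebf60e56022d251ebd"]
  ],
  ["FooQuux", "Quux"],
  false
);
var chunk2 = latestPerUser(
  [[["a", 1], "a8f2298e5961b0ebf60e56022d253a1b"]],
  ["Bar"],
  false
);

// Rereduce pass: merge the partial results.
var merged = latestPerUser(null, [chunk1, chunk2], true);
console.log(JSON.stringify(merged));
// {"a":{"s":"Bar","t":1},"b":{"s":"Quux","t":2}}
```

Note that the result object grows with the number of distinct users, so on a large data set CouchDB's reduce-size checking may reject it; for single-user lookups the reduce=false query avoids the reduce entirely.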

Output

reduce=false, startkey ["a",{}] endkey ["a"], descending, limit 1

This is for "selecting a single specified entry"

{
  "value": "Bar",
  "key": [
    "a",
    1
  ],
  "id": "a8f2298e5961b0ebf60e56022d253a1b"
}

Reduce, Group level 1

This is for listing "specific values" (chosen by query)

{
  "value": {
    "b": {
      "s": "Quux",
      "t": 2
    }
  },
  "key": [
    "b"
  ]
}

{
  "value": {
    "a": {
      "s": "Bar",
      "t": 1
    }
  },
  "key": [
    "a"
  ]
}

Reduce, Group level None

This is for dumping out all items.

{
  "value": {
    "a": {
      "s": "Bar",
      "t": 1
    },
    "b": {
      "s": "Quux",
      "t": 2
    }
  },
  "key": null
}
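The string-length totals from the question are not produced by this view directly, but they can be derived from the ungrouped row on the client. A minimal sketch, assuming the row has already been fetched and parsed (the variable names are made up for the example):

```javascript
// The ungrouped reduce row shown above, as a parsed object.
const row = {
  key: null,
  value: {
    a: { s: "Bar", t: 1 },
    b: { s: "Quux", t: 2 },
  },
};

// Length of the latest string per user, plus the overall total.
const perUser = {};
let total = 0;
for (const [user, entry] of Object.entries(row.value)) {
  perUser[user] = entry.s.length;
  total += entry.s.length;
}

console.log(perUser); // { a: 3, b: 4 }
console.log(total);   // 7
```

Because the reduce has already discarded the older FooQuux entry, it does not contribute to either number.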