
Essentially, I'm storing a directed graph of entities in CouchDB, and need to be able to find the edges going IN to and OUT of a given entity.

SETUP:

The way the data is being stored right now is as follows. Each document represents a RELATION between two entities:

doc: {
    entity1: { name: '' ... },
    entity2: { name: '' ... }
    ...
}

I have a view which does a bunch of emits, two of which emit documents keyed on their entity1 component and on their entity2 component, so something like:

function(doc) {
    emit(['entity1', doc.entity1.name]);
    emit(['entity2', doc.entity2.name]);
}

Edges are directed, and go from entity1 to entity2. So if I want to find edges going out of an entity, I just query the first emit; if I want edges going into an entity, I query the second emit.

PROBLEM:

The problem is that I also need to capture edges going both INTO and OUT OF an entity. Is there a way I can group or reduce these two emits into a single bi-directional set of [x] UNIQUE pairs?

Is there a better way of organizing my view to support this kind of query?

1 Answer

It might be preferable to just create a second view. But there's nothing stopping you from cramming all sorts of different data into the same view like so:

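function(doc) {
    // Edge from a node to itself.
    if (doc.entity1.name == doc.entity2.name) {
      emit(['self-ref', doc.entity1.name], 1);
    }
    // Exact source -> target pair.
    emit(['both', [doc.entity1.name, doc.entity2.name]], 1);
    // Any edge touching a node, with its direction kept in the key.
    emit(['either', [doc.entity1.name, "out"]], 1);
    emit(['either', [doc.entity2.name, "in"]], 1);
    // The original per-direction lookups.
    emit(['out', doc.entity1.name], 1);
    emit(['in', doc.entity2.name], 1);
}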
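To see which rows a single document contributes, the map function can be run outside CouchDB against a stand-in `emit`. This is only a sketch: the `emit` stub, the `rows` array, and the sample document are assumptions made for illustration, not part of CouchDB.

```javascript
// Local stand-in for CouchDB's emit(); it only records rows so they can be inspected.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// The combined map function, written with an explicit doc parameter.
function map(doc) {
  if (doc.entity1.name == doc.entity2.name) {
    emit(['self-ref', doc.entity1.name], 1);
  }
  emit(['both', [doc.entity1.name, doc.entity2.name]], 1);
  emit(['either', [doc.entity1.name, 'out']], 1);
  emit(['either', [doc.entity2.name, 'in']], 1);
  emit(['out', doc.entity1.name], 1);
  emit(['in', doc.entity2.name], 1);
}

// Hypothetical sample document describing the edge a -> b.
map({ entity1: { name: 'a' }, entity2: { name: 'b' } });

// Print one emitted key per line.
for (const row of rows) {
  console.log(JSON.stringify(row.key));
}
```

A non-self-referencing edge yields five rows: one "both" row, two "either" rows (one per endpoint), and one each for "out" and "in".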

Then you could easily do the following:

  • find all the self-refs:
    • startkey=["self-ref"]&endkey=["self-ref", {}]
  • find all of the edges (incoming or outgoing) for a particular node:
    • startkey=["either", [nodeName]]&endkey=["either", [nodeName, {}]]
    • if you don't reduce this, the key still preserves "in" vs "out". If you never need to query for all nodes with incoming or outgoing edges, then the "either" emits can replace the last two emits.
  • find all of the edges from node1 -> node2:
    • key=["both", [node1, node2]]

as well as your original queries for incoming or outgoing edges of a particular node.
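The key parameters above have to be sent as URL-encoded JSON. A minimal sketch of building those query strings follows; the `viewQuery` helper and the node names are hypothetical, and the design document and view path are left out because they depend on your setup.

```javascript
// Hypothetical helper: serialize view parameters as URL-encoded JSON values.
function viewQuery(params) {
  return Object.entries(params)
    .map(([name, value]) => `${name}=${encodeURIComponent(JSON.stringify(value))}`)
    .join('&');
}

const nodeName = 'a'; // placeholder node name

// All self-referencing edges.
const selfRefs = viewQuery({ startkey: ['self-ref'], endkey: ['self-ref', {}] });

// All edges touching nodeName; {} sorts after strings, so it closes the range.
const either = viewQuery({
  startkey: ['either', [nodeName]],
  endkey: ['either', [nodeName, {}]],
});

// The edge(s) from node a to node b.
const both = viewQuery({ key: ['both', ['a', 'b']] });

console.log(selfRefs);
console.log(either);
console.log(both);
```

Each string would be appended after the `?` of the view URL. If the view also defines a built-in reduce such as `_count`, adding `group=true` should return one row per distinct key, which may be a way to get the unique pairs the question asks about.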

I'd recommend benchmarking your application's typical use cases before choosing between this combined-view approach and a multi-view approach.