Efficient ArangoDB AQL query that can find all Edges in Collection that go To and From the same Documents

Question

I have an ArangoDB with two collections, one for Documents, and one for Edges, I'd like an efficient query capable of only returning the Edges that share the same _to and _from values.

For example, assume my Documents Collection contains A, B, C, and D. Let's also assume my Edges Collection contains X, Y, and Z.

Edge X is _from A and _to B.

Edge Y is _from A and _to B as well.

Edge Z is _from C and _to D.

However, I do not know that X and Y are basically the same Edge, and I don't know that the Documents that share similar Edges are A & B. My query is to seek out these duplicate Edges.

What I have so far looks like this:

FOR ec1 IN edge_collection
    FOR ec2 IN edge_collection
        FILTER ec1._key != ec2._key AND ec1._to == ec2._to AND ec1._from == ec2._from
        RETURN ec1

This seems to work, though it also feels terribly inefficient. Is there a better way to go about doing this, or is this the best possible solution with AQL?

CodeManX CodeManX · Accepted Answer · 2021-04-23T10:09:01

You can group by _from and _to, count how many edges there are per group, and filter out the unique combinations:

FOR ec IN edge_collection
  COLLECT from = ec._from, to = ec._to WITH COUNT INTO count
  FILTER count > 1
  RETURN { from, to, count }

Or if you want to return the edge keys as well:

FOR ec IN edge_collection
  COLLECT from = ec._from, to = ec._to INTO edges = ec._key
  LET count = LENGTH(edges)
  FILTER count > 1
  RETURN { from, to, count, edges }

Alternatively using an aggregation instead of a post-calculation:

FOR ec IN edge_collection
  COLLECT from = ec._from, to = ec._to AGGREGATE count = LENGTH(1) INTO edges = ec._key
  FILTER count > 1
  RETURN { from, to, count, edges }

To return the full edges use INTO edges = ec instead. You could also use just INTO edges but then each edge will be nested in an object {"ec": … }.

Efficient ArangoDB AQL query that can find all Edges in Collection that go To and From the same Documents

2 Answers