I have two queries which, as far as I understand, should do basically the same thing. One filters directly on my edge collection and performs very well; the other does a graph traversal of depth 1 and performs quite poorly because it does not use the correct index.

I have an accounts collection, a transfers edge collection, and a combined skiplist index on transfers._to and transfers.quantity.

This is the filter query:

 FOR transfer IN transfers  
    FILTER transfer._to == "accounts/testaccount" && transfer.quantity > 100
    RETURN transfer

It correctly uses the combined index:

Execution plan:
 Id   NodeType            Est.   Comment
  1   SingletonNode          1   * ROOT
  6   IndexNode       18930267     - FOR transfer IN transfers   /* skiplist index scan */
  5   ReturnNode      18930267       - RETURN transfer

Indexes used:
 By   Type       Collection   Unique   Sparse   Selectivity   Fields                  Ranges
  6   skiplist   transfers    false    false        10.11 %   [ `_to`, `quantity` ]   ((transfer.`_to` == "accounts/testaccount") && (transfer.`quantity` > 100))

Optimization rules applied:
 Id   RuleName
  1   use-indexes
  2   remove-filter-covered-by-index
  3   remove-unnecessary-calculations-2

This, on the other hand, is my graph traversal query:

 FOR account IN accounts
     FILTER account._id == "accounts/testaccount"

     FOR v, e IN 1..1 INBOUND account transfers
         FILTER e.quantity > 100
         RETURN e

It uses only _to from the combined index to find the inbound edges and does not use quantity; that condition is evaluated afterwards in a separate FilterNode:

Execution plan:
 Id   NodeType          Est.   Comment
  1   SingletonNode        1   * ROOT
  9   IndexNode            1     - FOR account IN accounts   /* primary index scan */
  5   TraversalNode        9       - FOR v  /* vertex */, e  /* edge */ IN 1..1  /* min..maxPathDepth */ INBOUND account /* startnode */  transfers
  6   CalculationNode      9         - LET #7 = (e.`quantity` > 100)   /* simple expression */
  7   FilterNode           9         - FILTER #7
  8   ReturnNode           9         - RETURN e

Indexes used:
 By   Type       Collection   Unique   Sparse   Selectivity   Fields                  Ranges
  9   primary    accounts     true     false       100.00 %   [ `_key` ]              (account.`_id` == "accounts/testaccount")
  5   skiplist   transfers    false    false            n/a   [ `_to`, `quantity` ]   base INBOUND

Traversals on graphs:
 Id   Depth   Vertex collections   Edge collections   Options                                   Filter conditions
  5   1..1                         transfers          uniqueVertices: none, uniqueEdges: path   

Optimization rules applied:
 Id   RuleName
  1   use-indexes
  2   remove-filter-covered-by-index
  3   remove-unnecessary-calculations-2

However, as I want to use the graph traversal, is there a way to utilize this combined index correctly?

Edit: I'm using ArangoDB 3.4.2

1 Answer

Vertex-centric indexes (indexes created on an edge collection that include either the '_from' or the '_to' property, plus further attributes) are normally used in traversals when the filter is expressed on the path rather than on the edge variable itself (assuming the optimizer does not find a better plan, of course).

So in your query, try something like the following:

 FOR account IN accounts
     FILTER account._id == "accounts/testaccount"

     // Declare the path variable p so the filter can refer to p.edges
     FOR v, e, p IN 1..1 INBOUND account transfers
         FILTER p.edges[*].quantity ALL > 100
         RETURN e

The ArangoDB documentation describes this index type under "Vertex-Centric Indexes".
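
Because the traversal here has a fixed depth of 1, the condition can probably also be written against the first edge of the path instead of using ALL. The following is only a sketch, assuming the optimizer in your version treats p.edges[0] the same way as the p.edges[*] ... ALL form. It also passes the start vertex as an _id string, which makes the outer loop over accounts unnecessary:

```aql
// Sketch: depth-1 inbound traversal starting directly from the vertex _id.
// The filter is placed on the first edge of the path (p.edges[0]), not on e.
FOR v, e, p IN 1..1 INBOUND "accounts/testaccount" transfers
    FILTER p.edges[0].quantity > 100
    RETURN e
```

If the index is picked up, the explain output should show the quantity condition in the "Filter conditions" column of the "Traversals on graphs" section, rather than as a separate FilterNode after the TraversalNode.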