I'm designing a solution that uses ArangoDB in which a single Edge Collection will connect between 5 and 200 Vertex Collections.

Each Vertex Collection will have between 1 and 180 Edge Collections bound to it.

Each Edge Collection will have a Graph object created for it.

I'm new to ArangoDB and would like to know whether there are any key performance impacts that I need to be aware of.

Server hardware shouldn't be a problem, as it would be possible to utilise larger server instances on cloud providers.

I'm more interested in the performance of ArangoDB with Edge Collections referencing so many shared Vertex Collections, as well as any other issues that aren't so obvious.

The current version of ArangoDB I'm using is 2.8.2.

Thanks!

1 Answer

For the performance side there are the following factors:

Not Using Graphs:

  1. Adding edges to vertices in as many collections as you like has no overhead.
  2. Every collection has an overhead by itself, since it uses its own datafiles etc.
  3. Deleting a Vertex/Edge with AQL or the Document API directly is not affected by the total number of connected collections. (NOTE: in this case edges pointing to this document will not be removed!)
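
To illustrate the note in point 3, here is a minimal sketch in Python. It is a toy in-memory model, not ArangoDB code: the dictionaries, the document IDs and the helper names `delete_document_directly` and `dangling_edges` are all made up for the example. It only shows why a plain document delete can leave edges pointing at a vertex that no longer exists.

```python
# Toy in-memory model (assumption: NOT the ArangoDB API).
# Collections are plain dicts keyed by hypothetical document IDs.
vertices = {
    "users/alice": {"name": "Alice"},
    "users/bob": {"name": "Bob"},
}
edges = {
    "knows/1": {"_from": "users/alice", "_to": "users/bob"},
}


def delete_document_directly(doc_id):
    """Mimic a plain document delete: only the document itself is removed."""
    vertices.pop(doc_id, None)


def dangling_edges():
    """Return keys of edges whose _from or _to no longer resolves to a vertex."""
    return [
        key
        for key, edge in edges.items()
        if edge["_from"] not in vertices or edge["_to"] not in vertices
    ]


delete_document_directly("users/bob")
orphaned = dangling_edges()
print(orphaned)  # the edge "knows/1" still exists but points at a missing vertex
```

If you delete vertices this way, cleaning up such edges is your own responsibility.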

Using Graphs:

Whenever you delete a vertex through the graph API the following will happen:

  1. The Vertex is deleted (constant time)
  2. All edges to this vertex in the edge collections known to this graph are removed. This scans through all edge definitions, and within each through all from and all to definitions, to check whether the vertex is potentially connected there. If so, it does an index lookup for all edges to this vertex and removes them.
  3. Next it will scan through all other graphs and, for each of them, check whether the vertex's collection is part of one of its edge definitions.
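
The steps above can be roughly modelled to see how the work grows with the number of graphs. The following Python sketch is a simplified cost model under my own assumptions, not ArangoDB's actual implementation: the classes `EdgeDefinition` and `Graph`, the function `count_delete_work` and the collection names are hypothetical, and it counts one index lookup per matching edge definition.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeDefinition:
    edge_collection: str
    from_collections: set
    to_collections: set


@dataclass
class Graph:
    name: str
    edge_definitions: list = field(default_factory=list)


def count_delete_work(graphs, vertex_collection):
    """Count the work for deleting one vertex through a graph API.

    Returns (edge definitions inspected, edge-index lookups triggered).
    Simplification: one lookup per edge definition that mentions the
    vertex's collection on either side.
    """
    inspected = 0
    lookups = 0
    for graph in graphs:
        for definition in graph.edge_definitions:
            inspected += 1
            if (
                vertex_collection in definition.from_collections
                or vertex_collection in definition.to_collections
            ):
                lookups += 1
    return inspected, lookups


# Hypothetical layout similar to the question: 180 graphs with one edge
# collection each, all of them starting at the shared collection "v0".
graphs = [
    Graph(f"g{i}", [EdgeDefinition(f"e{i}", {"v0"}, {f"v{i + 1}"})])
    for i in range(180)
]

inspected, lookups = count_delete_work(graphs, "v0")
print(inspected, lookups)
```

In this model every delete inspects every edge definition in every graph, and a vertex in a heavily shared collection additionally triggers a lookup in each edge collection that references it.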

So from my understanding, the delete operation in your case will be extremely expensive. Insertions, updates, lookups and queries are not affected by the number of connected collections.

However, I think having so many graphs and so many collections seems a bit over-engineered, but as I do not know the details of your use case I cannot judge whether it is necessary or not.