I've some performance issue on a quite big data store. For optimizing the insert phase, we created a document store and not a graph, infact the edge creation performance was too slow. Essentially now we have a class A (with about 30M documents) with a link (say field fieldL) to a class B (about 500 documents). The query structure is like:
select from A where field1='field1value' and field2='field2value' and field3>0 ... and fieldL in (select from B where ...)
The first issue i've found is this: I've created n indexes on the n properties engaged in the where condition, but the explain command showed me orient uses only one... https://github.com/orientechnologies/orientdb/issues/3626 So I've created a composite index and if I perform a query involving only the index, say
select from A where field1='field1value' and field2='field2value' and field3>0 the result is really fast
The issue is about the second part of the query, involving the fieldL and the links. I've tried with the [#rid,...] syntax but it seems not perform well.
I've also tried to change the schema using a different approach: class B with multiple links to class A, using a different query pattern (say the field containing the links fieldL1):
select * from (select expand(fieldL1) from B where ...) where field1='field1value' and field2='field2value' and field3>0
In this case the subquery executes a sort of partition of the data, but unfortunatelly we lose the indexes on the result set, so we have really slow performances on the second where clause (field1='field1value' and field2='field2value' and field3>0).
My question is: Does it exist a better query pattern to execute these kind of query faster?
Thank you very much.
By the way during the performance tuning it seems really awkward to perform a count of the documents involved in a query. (https://github.com/orientechnologies/orientdb/issues/3462)