I am evaluating ArangoDB (version 3.2.4) as a replacement for MongoDB. We have a huge collection containing 2,700,000 documents. Next year this collection will grow to nearly 4,000,000 documents.
When I read data from that collection using the Java driver (version 4.2), it takes a long time before the cursor returns any data. The time depends on the number of documents fetched: if I fetch all documents, it takes about 10 minutes for the cursor to fetch the data:
AQL:
FOR doc IN myHugeCollection
  RETURN { "name": doc.name }
Java code:
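import com.arangodb.ArangoCursor;
import com.arangodb.model.AqlQueryOptions;
import java.util.HashMap;

// The AQL query shown above.
String aqlQuery = "FOR doc IN myHugeCollection RETURN { \"name\": doc.name }";

AqlQueryOptions aqlQueryOptions = new AqlQueryOptions();
aqlQueryOptions.batchSize(500);  // documents per round trip
aqlQueryOptions.count(false);    // do not compute the total result count
aqlQueryOptions.cache(true);     // allow the AQL query cache to be used

ArangoCursor<MyHugeCollection> arangoCursor = arangoDatabase.query(
        aqlQuery,
        new HashMap<String, Object>(),  // no bind variables
        aqlQueryOptions,
        MyHugeCollection.class);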
It takes about 10 minutes before I can access the data via the cursor. Since I set the batch size to 500, I expected a quick response, because fetching the first 500 results is extremely fast.
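To make that expectation concrete, here is a minimal, self-contained sketch in plain Java. It does not use the ArangoDB driver, and every name in it (BatchedCursorSketch, produce, eagerCursor, lazyCursor) is hypothetical. It only models two ways a cursor could behave: building the whole result before returning, or producing one batch at a time on demand. The 10-minute wait looks like the eager model, while I expected something like the lazy one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class BatchedCursorSketch {

    // Counts how many documents the simulated "server" has produced so far.
    static int produced = 0;

    // Hypothetical stand-in for the server producing document number i.
    static String produce(int i) {
        produced++;
        return "name-" + i;
    }

    // Eager model: the whole result is built before the caller sees anything.
    static Iterator<String> eagerCursor(int total) {
        List<String> all = new ArrayList<>(total);
        for (int i = 0; i < total; i++) {
            all.add(produce(i));
        }
        return all.iterator();
    }

    // Lazy model: one batch is produced at a time, only when it is needed.
    static Iterator<String> lazyCursor(final int total, final int batchSize) {
        return new Iterator<String>() {
            private int position = 0;
            private Iterator<String> batch = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                if (!batch.hasNext() && position < total) {
                    // Current batch is exhausted: produce the next one.
                    List<String> b = new ArrayList<>(batchSize);
                    int end = Math.min(position + batchSize, total);
                    for (; position < end; position++) {
                        b.add(produce(position));
                    }
                    batch = b.iterator();
                }
                return batch.hasNext();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return batch.next();
            }
        };
    }

    public static void main(String[] args) {
        produced = 0;
        eagerCursor(10_000).next();
        // prints "eager: produced before first result = 10000"
        System.out.println("eager: produced before first result = " + produced);

        produced = 0;
        lazyCursor(10_000, 500).next();
        // prints "lazy: produced before first result = 500"
        System.out.println("lazy: produced before first result = " + produced);
    }
}
```

In the lazy model the cost of the first result depends only on the batch size, not on the size of the collection, which is the behavior I assumed batchSize(500) would give me.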
Modified AQL fetching only the first 500 documents:
FOR doc IN myHugeCollection
  LIMIT 500
  RETURN { "name": doc.name }
This query takes about 20 ms.
So my question is: what am I doing wrong? How can I access data in a huge collection without waiting minutes for the cursor to become available?