0
votes

I have a query that runs well in single-instance setup. However, when I tried to run it on a sharded cluster, the performance dropped (4x longer execution time).

The query plan shows that practically all processing is done on Coordinator node, not on DbServer. How to push the query to be executed at DbServer?

To give a bit of a context: I have a collection of ~120k (will grow to several millions) of multi-level JSON documents with nested arrays. And the query needs to unnest these arrays before getting to the proper node.

AQL Query:

for doc IN doccollection
for arrayLevel1Elem in doc.report.container.children.container
for arrayLevel2Elem in arrayLevel1Elem.children.container.children.num
for arrayLevel3Elem in arrayLevel2Elem.children.code

filter doc.report.container.concept.simpleCodedValue == 'A' 
filter arrayLevel1Elem.concept.codedValue == "B"
filter arrayLevel2Elem.concept.simpleCodedValue == "C"   
filter arrayLevel3Elem.concept.simpleCodedValue == 'X'
filter arrayLevel3Elem.value.simpleCodedValue == 'Y'     

collect studyUid = doc.report.study.uid, personId = doc.report.person.id, metricName = arrayLevel2Elem.concept.meaning, value = to_number(arrayLevel2Elem.value)

return {studyUid, personId, metricName, value}

Query Plan:

 Id   NodeType                  Site          Est.   Comment
  1   SingletonNode             DBS              1   * ROOT
  2   EnumerateCollectionNode   DBS         121027     - FOR doc IN doccollection   /* full collection scan, projections: `report`, 2 shard(s) */   FILTER (doc.`report`.`container`.`concept`.`simpleCodedValue` == "A")   /* early pruning */
  3   CalculationNode           DBS         121027       - LET #8 = doc.`report`.`container`.`children`.`container`   /* attribute expression */   /* collections used: doc : doccollection */
 19   CalculationNode           DBS         121027       - LET #24 = doc.`report`.`study`.`uid`   /* attribute expression */   /* collections used: doc : doccollection */
 20   CalculationNode           DBS         121027       - LET #26 = doc.`report`.`person`.`id`   /* attribute expression */   /* collections used: doc : doccollection  */
 29   RemoteNode                COOR        121027       - REMOTE
 30   GatherNode                COOR        121027       - GATHER   /* parallel, unsorted */
  4   EnumerateListNode         COOR      12102700       - FOR arrayLevel1Elem IN #8   /* list iteration */
 11   CalculationNode           COOR      12102700         - LET #16 = (arrayLevel1Elem.`concept`.`codedValue` == "B")   /* simple expression */
 12   FilterNode                COOR      12102700         - FILTER #16
  5   CalculationNode           COOR      12102700         - LET #10 = arrayLevel1Elem.`children`.`container`.`children`.`num`   /* attribute expression */
  6   EnumerateListNode         COOR    1210270000         - FOR arrayLevel2Elem IN #10   /* list iteration */
 13   CalculationNode           COOR    1210270000           - LET #18 = (arrayLevel2Elem.`concept`.`simpleCodedValue` == "C")   /* simple expression */
 14   FilterNode                COOR    1210270000           - FILTER #18
  7   CalculationNode           COOR    1210270000           - LET #12 = arrayLevel2Elem.`children`.`code`   /* attribute expression */
 21   CalculationNode           COOR    1210270000           - LET #28 = arrayLevel2Elem.`concept`.`meaning`   /* attribute expression */
 22   CalculationNode           COOR    1210270000           - LET #30 = TO_NUMBER(arrayLevel2Elem.`value`)   /* simple expression */
  8   EnumerateListNode         COOR  121027000000           - FOR arrayLevel3Elem IN #12   /* list iteration */
 15   CalculationNode           COOR  121027000000             - LET #20 = ((arrayLevel3Elem.`concept`.`simpleCodedValue` == "X") && (arrayLevel3Elem.`value`.`simpleCodedValue` == "Y"))   /* simple expression */
 16   FilterNode                COOR  121027000000             - FILTER #20
 23   CollectNode               COOR   96821600000             - COLLECT studyUid = #24, personId = #26, metricName = #28, value = #30   /* hash */
 26   SortNode                  COOR   96821600000             - SORT studyUid ASC, personId ASC, metricName ASC, value ASC   /* sorting strategy: standard */
 24   CalculationNode           COOR   96821600000             - LET #32 = { "studyUid" : studyUid, "personId" : personId, "metricName" : metricName, "value" : value }   /* simple expression */
 25   ReturnNode                COOR   96821600000             - RETURN #32

Thanks a lot for any hint.

1

1 Answers

0
votes

Queries are not actually executed at the DB server - the coordinators handle query compilation and execution, only really asking the DB server(s) for data.

This means memory load for query execution happens on the coordinators (good!) but that the coordinator has to transport (sometimes LARGE amounts of) data across the network. This is THE BIGGEST downside to moving to a cluster - and not one that is easily solved.

I walked this same road in the beginning and found ways to optimize some of my queries, but in the end, it was easier to go with a "one-shard" cluster or an "active-failover" setup.

It's tricky to make architecture suggestions because each use case can be so different, but there are some general AQL guidelines I follow:

  1. Collecting FOR and FILTER statements is not recommended (see #2). Try this version to see if it runs any faster (and try indexing report.container.concept.simpleCodedValue) :
FOR doc IN doccollection
    FILTER doc.report.container.concept.simpleCodedValue == 'A'
    FOR arrayLevel1Elem in doc.report.container.children.container
        FILTER arrayLevel1Elem.concept.codedValue == 'B'
        FOR arrayLevel2Elem in arrayLevel1Elem.children.container.children.num
            FILTER arrayLevel2Elem.concept.simpleCodedValue == 'C'
            FOR arrayLevel3Elem in arrayLevel2Elem.children.code
                FILTER arrayLevel3Elem.concept.simpleCodedValue == 'X'
                FILTER arrayLevel3Elem.value.simpleCodedValue == 'Y'
                COLLECT
                    studyUid = doc.report.study.uid,
                    personId = doc.report.person.id,
                    metricName = arrayLevel2Elem.concept.meaning,
                    value = to_number(arrayLevel2Elem.value)
                RETURN { studyUid, personId, metricName, value }
  1. The FOR doc IN doccollection pattern will recall the ENTIRE document from the DB server for each item in doccollection. Best practice is to either limit the number of documents you are retrieving (best done with an index-backed search) and/or return only a few attributes. Don't be afraid of using LET - in-memory on the coordinator can be faster than in-memory on the DB. This example does both - filters and returns a smaller set of data:
LET filteredDocs = (
    FOR doc IN doccollection
        FILTER doc.report.container.concept.simpleCodedValue == 'A'
        RETURN {
            study_id: doc.report.study.uid,
            person_id: doc.report.person.id,
            arrayLevel1: doc.report.container.children.container
        }
)
FOR doc IN filteredDocs
    FOR arrayLevel1Elem in doc.arrayLevel1
        FILTER arrayLevel1Elem.concept.codedValue == 'B'
        ...