I am using neo4j-2.2.1 and using Transactional Cypher Restful Endpoint to query. I am trying to match email nodes using the following query
match (e:Email) where e.email in ['gxxxxxxs@yyy.com'] return count(e);
The email property in Email node has a unique constraint on it and therefore has a index built automatically around it. The above query took 23 seconds when I ran it using query parameters as in {"statements": [{"parameters": {"emails": ["gxxxxxxs@yyy.com"]}, "statement": "match (c:Email) where c.email in {emails} return count(c)"}]}
where as it took only 0.0134 seconds when I ran it directly without any parameters as in
{"statements": [{"statement": "match (c:Email) where c.email in ['gxxxxxxs@yyy.com'] return count(c)"}]}
I tried to profile it to see how the query is executed and to my surprise, the parameterized query was not using the unique index seek which I expected it to use.
The results of the profiling as given below -
Profile of Non-Parameterized Query
{"statements": [{"statement": "PROFILE match (c:Email) where c.email in ['gxxxxxxs@yyy.com'] return count(c)"}]}
{"results":[{"columns":["count(c)"],"data":[{"row":[1]}],"plan":{"root":{"operatorType":"EagerAggregation","DbHits":0,"Rows":1,"version":"CYPHER 2.2","KeyNames":"","EstimatedRows":1.0000000000279299,"planner":"COST","identifiers":["count(c)"],"children":[{"operatorType":"NodeUniqueIndexSeek","Index":":Email(email)","Rows":1,"DbHits":1,"EstimatedRows":1.00000000005586,"identifiers":["c"],"children":[]}]}}}],"errors":[]}
Finished in 0.0134048461914 sec
Profile of Parameterized Query
{"statements": [{"parameters": {"emails": ["gxxxxxx@yyy.com"]}, "statement": "PROFILE match (c:Email) where c.email in {emails} return count(c)"}]} {"results":[{"columns":["count(c)"],"data":[{"row":[1]}],"plan":{"root":{"operatorType":"EagerAggregation","DbHits":0,"Rows":1,"version":"CYPHER 2.2","KeyNames":"","EstimatedRows":1384.8193022918188,"planner":"COST","identifiers":["count(c)"],"children":[{"operatorType":"Filter","LegacyExpression":"any(--INNER-- in {emails} where c.email == --INNER--)","Rows":1,"DbHits":5114522,"EstimatedRows":1917724.4999999998,"identifiers":["c"],"children":[{"operatorType":"NodeByLabelScan","LabelName":":Email","Rows":2557261,"DbHits":2557262,"EstimatedRows":2556966.0,"identifiers":["c"],"children":[]}]}]}}}],"errors":[]}
Finished in 23.5868499279 sec
Can someone please help me understand why a parameterized cypher query is not using the unique index seek