I have a Neo4j database with the following properties:

  • Array Store 8.00 KiB
  • Logical Log 16 B
  • Node Store 174.54 MiB
  • Property Store 477.08 MiB
  • Relationship Store 3.99 GiB
  • String Store Size 174.34 MiB
  • Total Store Size 5.41 GiB

There are 12M nodes and 125M relationships.

So you could say this is a pretty large database.

My OS is Windows 10 64-bit, running on an Intel i7-4500U CPU @ 1.80 GHz with 8 GB of RAM. This isn't a complete powerhouse, but it's a decent machine, and in theory the total store could even fit in RAM.

However, when I run a very simple query (using the Neo4j Browser):

MATCH (n {title:"A clockwork orange"}) RETURN n;

I get a result:

Returned 1 row in 17445 ms.

I also sent a POST request with the same query to http://localhost:7474/db/data/cypher; this took 19 seconds.

However, something like http://localhost:7474/db/data/node/15000 executes in 23 ms.

And I can confirm there is an index on title:

Indexes
ON :Page(title) ONLINE 

Does anyone have ideas on why this might be running so slowly?

Thanks!

1 Answer

This query has to scan every node in the db. If you re-run it using n:Page instead of just n, it will use the index on those nodes and you'll get better results.


To expand on this a bit: the index ON :Page(title) covers only nodes with a :Page label, so to take advantage of it your MATCH needs to specify that label in its pattern.

If a MATCH is written without a label, the query engine has no "clue" what you're looking for, so it has to do a full db scan to find all the nodes with a title property and check their values.

That's why

MATCH (n {title:"A clockwork orange"}) RETURN n;

is taking so long - it has to scan the entire db.
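
You can check this yourself by prefixing the query with EXPLAIN, which shows the execution plan without actually running the query. This is only a sketch, assuming your Neo4j version supports the EXPLAIN prefix; the operator names in the output (for example AllNodesScan) depend on the version and planner in use.

```cypher
// Show the execution plan only; the query itself is not executed.
// Without a label, expect a scan over all nodes (an operator such as
// AllNodesScan) followed by a filter on n.title.
EXPLAIN MATCH (n {title:"A clockwork orange"}) RETURN n;
```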

If you tell the MATCH() you're looking for a node with a :Page label and a title property:

MATCH (n:Page {title:"A clockwork orange"}) RETURN n;

the query engine knows you're looking for nodes with that label, and it knows there's an index on that label it can use, so it can look the node up directly instead of scanning, which gives you the performance you're looking for.
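
To confirm the index is actually being used, you can prefix the query with PROFILE, which runs it and reports the operators and database hits. Again this is a sketch, assuming the ON :Page(title) index from the question and a Neo4j version that supports PROFILE; operator names such as NodeIndexSeek vary by version. The second statement uses a USING INDEX hint, which is only worth trying if the planner does not pick the index on its own.

```cypher
// Run the query and report the plan; with the label present, expect an
// index lookup (e.g. NodeIndexSeek) rather than a scan over all nodes.
PROFILE MATCH (n:Page {title:"A clockwork orange"}) RETURN n;

// Optional: explicitly ask the planner to use the :Page(title) index.
MATCH (n:Page)
USING INDEX n:Page(title)
WHERE n.title = "A clockwork orange"
RETURN n;
```

Run the two statements separately; the Neo4j Browser may not accept both in one go.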