If I have a tree modelled in Neo4j, and I know all the indexed node properties of a possible subtree, how can I quickly do a match to confirm whether that subtree exists?
For an easy-to-understand example: suppose my tree represents all the directories and files on a disk, and I want to confirm that a file exists. Here's what I'm doing now to test whether /a/b/c/d/e/f/g/h/i/j/k/file.txt exists in a graph where my filesystem nodes have the label mylabel and an indexed basename property:
MATCH (root:`mylabel` { basename: '/' })
  -[:contains]->(`1`:`mylabel` { basename: 'a' })
  -[:contains]->(`2`:`mylabel` { basename: 'b' })
  -[:contains]->(`3`:`mylabel` { basename: 'c' })
  -[:contains]->(`4`:`mylabel` { basename: 'd' })
  -[:contains]->(`5`:`mylabel` { basename: 'e' })
  -[:contains]->(`6`:`mylabel` { basename: 'f' })
  -[:contains]->(`7`:`mylabel` { basename: 'g' })
  -[:contains]->(`8`:`mylabel` { basename: 'h' })
  -[:contains]->(`9`:`mylabel` { basename: 'i' })
  -[:contains]->(`10`:`mylabel` { basename: 'j' })
  -[:contains]->(`11`:`mylabel` { basename: 'k' })
  -[:contains]->(leaf:`vdp`:`mylabel` { basename: 'file.txt' })
RETURN leaf
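To make the shape of the query concrete, here is a minimal sketch of how a query like the one above could be generated from a path. The helper name build_exists_query and the parameter names p0, p1, ... are hypothetical, made up for illustration. The sketch assumes a Neo4j version that accepts the $name parameter syntax, and it leaves out the extra vdp label on the leaf. Labels and relationship types cannot be passed as Cypher parameters, so they are spliced into the query text, while the basenames are passed as parameters.

```python
import posixpath


def build_exists_query(path, label="mylabel", rel="contains", root_basename="/"):
    """Return (cypher, params) for a pattern that names every node on the path.

    Hypothetical helper: the label, relationship type and root basename
    defaults mirror the example in the question, nothing more.
    """
    if not path.startswith("/"):
        raise ValueError("expected an absolute path")
    # Split into basenames, dropping empty components such as the leading '/'.
    parts = [p for p in posixpath.normpath(path).split("/") if p]
    if not parts:
        raise ValueError("path must name something below the root")

    # Basenames are passed as parameters; label and relationship type are
    # inlined because Cypher does not allow them to be parameterised.
    nodes = ["(root:`%s` {basename: $p0})" % label]
    params = {"p0": root_basename}
    for i, name in enumerate(parts, start=1):
        var = "leaf" if i == len(parts) else "n%d" % i
        nodes.append("(%s:`%s` {basename: $p%d})" % (var, label, i))
        params["p%d" % i] = name

    arrow = "-[:`%s`]->" % rel
    return "MATCH " + arrow.join(nodes) + " RETURN leaf", params


if __name__ == "__main__":
    query, params = build_exists_query("/a/b/c/d/e/f/g/h/i/j/k/file.txt")
    print(query)
    print(params)
```

The returned pair would then be handed to whatever driver call runs a parameterised query. For a different disk, root_basename would be set to the root node's basename including the disk name.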
I also happen to have an absolute path property on the leaf node, but I can't just search on that. I want to use a similar query to do a merge when I'm adding a file, and a file may have been moved (by deleting the relationship between leaf and k, and adding a new relationship between leaf and some other directory node) without the path property getting updated. Also, the same absolute path can occur multiple times for different files, because they can be on different disks (this is modelled by the root node having a basename that includes the disk name, but the disk name does not appear in the leaf node's path property).
The above query takes 2-7 seconds to run (uncached). Is there any way to do this more quickly? Since I specify every node in the path, I had hoped Neo4j could cope even with paths of length 12 or more.