0
votes

I have a large number of nodes that have outgoing relationships to an even larger number of nodes. I want to query for a limited number of starting nodes and return their related nodes along with them, but the number of related nodes per starting node should also be limited.

Is this possible in neo4j 1.9?

For example, create these nodes with an auto index on name:

CREATE p = (bar{company:'Bar1'})<-[:FREQUENTS]-(andres {name:'Andres'})-[:WORKS_AT]->(neo{company:'Neo1'}) 
WITH andres 
CREATE (restaurant{company:'Restaurant1'})<-[:FREQUENTS]-(andres)-[:WORKS_AT]->(lib{company:'Library'}) ;

CREATE p = (bar{company:'Bar2'})<-[:FREQUENTS]-(todd {name:'Todd'})-[:WORKS_AT]->(neo{company:'Neo2'}) 
WITH todd 
CREATE (restaurant{company:'Restaurant2'})<-[:FREQUENTS]-(todd)-[:WORKS_AT]->(lib{company:'Library2'}) ;

CREATE p = (bar{company:'Bar3'})<-[:FREQUENTS]-(hank {name:'Hank'})-[:WORKS_AT]->(neo{company:'Neo3'}) 
WITH hank 
CREATE (restaurant{company:'Restaurant3'})<-[:FREQUENTS]-(hank)-[:WORKS_AT]->(lib{company:'Library3'}) ;

What I would like is something like:

START p=node:node_auto_index('*:*') 
MATCH p-[:WORKS_AT]-> c, p-[:FREQUENTS]-> f 
RETURN p, collect(distinct c.company), collect(distinct f.company) LIMIT 2;

I want this to return 2 rows with each collection limited to one element, but without applying a function to the collections afterwards. I tried that on a large data set and it became extremely slow. So I need some way to LIMIT the matches themselves.

If this is not possible in neo4j 1.9, would there be a solution in neo4j 2.0?

2 Answers

1
votes

Can you try something like this:

START p=node:node_auto_index('*:*') 
RETURN p, 
     head(extract(path in p-[:WORKS_AT]->() : head(tail(nodes(path))))) as work_company,
     head(extract(path in p-[:FREQUENTS]->() : head(tail(nodes(path))))) as visit_company

The head function on the extracted path nodes should be lazy, so it pulls only the first match from each pattern.

If you look at the profiling output, you should see that it touches only the first node for each pattern.

0
votes

It could be that the *:* query triggers some very large operations in the indexing layer rather than being lazy. I would try something like this:

START p=node:node_auto_index('*:*') 
WITH p LIMIT 2
MATCH p-[:WORKS_AT]-> c, p-[:FREQUENTS]-> f 
RETURN p, collect(distinct c.company), collect(distinct f.company)
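
The two suggestions in this thread might also be combined: limit the starting nodes with WITH first, then take at most one related node per relationship type with head(extract(...)). The following is only a sketch in Neo4j 1.9 syntax, assembled from the two queries above and not run against a real data set. It assumes that a RETURN may directly follow WITH ... LIMIT and that the pattern expressions are evaluated lazily, as the first answer expects.

```cypher
// Sketch only: combines the WITH ... LIMIT idea with head(extract(...)).
// Assumes the same auto index on name as in the question.
START p=node:node_auto_index('*:*')
// Limit the starting nodes before any relationships are expanded.
WITH p LIMIT 2
// For each remaining start node, keep only the first related node per pattern.
RETURN p,
     head(extract(path in p-[:WORKS_AT]->() : head(tail(nodes(path))))) as work_company,
     head(extract(path in p-[:FREQUENTS]->() : head(tail(nodes(path))))) as visit_company
```

Note that this returns the related nodes themselves rather than their company property, just like the query in the first answer.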