I need to run a Neo4j Cypher query that excludes from the result any nodes connected to a specific node.

I'm using the query below, but it's very slow (it takes several seconds).

First I get the node (of type Context) that the other nodes should not be connected to.

Then I find the nodes created by the same user in another context, "help", and exclude those that are also connected to the "private" context.

MATCH (u:User {uid:'6dbe5450-852d-11e4-9c48-b552fc8c2b90'}),
      (ctxa:Context {name:"private"}), (ctxa)-[:BY]->(u)
WITH ctxa, u
MATCH (s:Statement), (ctx:Context {name:"help"}), (ctx)-[:BY]->(u),
      (s)-[:IN]->(ctx), (s)-[:BY]->(u)
WHERE NOT (s)-[:IN]->(ctxa)
RETURN DISTINCT s.uid ORDER BY s.timestamp DESC;

Does anyone have a better idea?

UPDATE: I also tried this:

MATCH (u:User {uid:'b9745f70-a13f-11e3-98c5-476729c16049'}), (s:Statement),
      (ctx:Context {name:"private"}), (ctx)-[:BY]->(u), (s)-[:IN]->(ctx), (s)-[:BY]->(u)
WITH s AS sta, u
MATCH (s:Statement), (ctx:Context {name:"help"}),
      (ctx)-[:BY]->(u), (s)-[:IN]->(ctx), (s)-[rel:BY]->(u)
WHERE s <> sta
RETURN DISTINCT s;
deemeetree, I think you could probably reduce the complexity of your queries to arrive at something a little "leaner". Can you explain your data model a bit and the goal of the query? - Dave Bennett
@DaveBennett here is the description of the data model: noduslabs.com/cases/graph-database-structure-specification - it kind of has to be this way, because I use it for a text network analysis app, github.com/noduslabs/infranodus. If you have any ideas how to make it simpler, I would really appreciate your help! - Aerodynamika

1 Answer

I found that using DISTINCT in the first WITH clause really improved the speed:

MATCH (u:User {uid:'b9745f70-a13f-11e3-98c5-476729c16049'}),
      (ctxa:Context {name:"private"}), (ctxa)-[:BY]->(u)
WITH DISTINCT ctxa, u
MATCH (s:Statement), (s)-[rel:BY]->(u), (ctx:Context {name:"bodypractices"}),
      (ctx)-[:BY]->(u), (s)-[:IN]->(ctx)
WHERE NOT (s)-[:IN]->(ctxa)
RETURN DISTINCT s;

Basically, the problem was that I was not using DISTINCT in the first WITH clause. Because of the structure of my database (or maybe for another reason), the first MATCH returned a lot of rows for ctxa, and the WHERE clause was then checked for each of them, which slowed everything down.
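
One way to check this explanation is to count how many rows the first MATCH produces and compare that with the number of distinct Context nodes. This is only a minimal sketch: the aliases totalRows and distinctContexts are made up for illustration, and it assumes the duplicates come from several BY relationships between the same "private" context and the user, which I have not confirmed.

```cypher
// Rows produced by the first MATCH vs. distinct "private" Context nodes.
// totalRows and distinctContexts are illustrative aliases, not part of the data model.
MATCH (u:User {uid:'b9745f70-a13f-11e3-98c5-476729c16049'}),
      (ctxa:Context {name:"private"}), (ctxa)-[:BY]->(u)
RETURN count(*) AS totalRows, count(DISTINCT ctxa) AS distinctContexts;
```

If totalRows is much larger than distinctContexts, the second MATCH and its WHERE check were being repeated once per duplicate row, which would be consistent with the speed-up from WITH DISTINCT.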