I am having a problem with a null result on a query with multiple optional matches.

//Match gs to searched w
MATCH (w1:W {name: "****"})-[:REL]->(gs:G)
WITH w1, COLLECT(DISTINCT gs) AS gsCol, SIZE((w1)-[:REL]-()) AS gCount
OPTIONAL MATCH (w1)-[:REL]-()-[:SIMILAR*0..1]->(gs:G)
WITH w1, gsCol, gCount, COLLECT(DISTINCT gs) AS similarGs
//Match all ws that contain gs in searched w or where similar as wsCol
OPTIONAL MATCH (w1)-[c2a:REL]->(g4:G)-[c2b:REL|:SIMILAR*0..1]-(ws:W)
WHERE c2a.amount - 10 < last(c2b).amount < c2a.amount + 10
WITH w1, gsCol, similarGs, gCount, COLLECT(DISTINCT ws) AS ws2, COLLECT(DISTINCT ws) AS ws3, COLLECT(DISTINCT ws) AS ws4
//Match ws from wsCol where all gs in new matched ws are same
UNWIND ws2 as w2
OPTIONAL MATCH (w2)-[c3:REL]->(g3:G)
WITH w1, w2, ws3, ws4, gsCol, similarGs, gCount, COLLECT(g3) AS gs3, SIZE((w2)-[:REL]->()) as gCount3, SUM(c3.amount) AS c3amount
WHERE ALL(x in gs3 WHERE x IN gsCol)
WITH w1, w2, ws3, ws4, gsCol, similarGs, gCount, gCount3, c3amount
WHERE gCount3 = gCount AND c3amount = 100
WITH COLLECT(w2) AS ws2Col, w1, ws3, ws4, gsCol, similarGs, gCount
//Match ws with gs that are in searched or similar to searched w
UNWIND ws3 as w3
WITH w1, w3, ws4, gsCol, similarGs, gCount, ws2Col
OPTIONAL MATCH (w3)-[c4:REL]->(g4:G)
WITH w1, w3, ws4, ws2Col, gsCol, similarGs, gCount, COLLECT(g4) AS gs4, SIZE((w3)-[:REL]->()) AS gCount4, SUM(c4.amount) AS c4amount
WHERE ALL(x in gs4 WHERE x in similarGs)
WITH w1, w3, ws4, ws2Col, gsCol, similarGs, gCount, gs4, gCount4
WHERE gCount4 = gCount AND c4amount = 100 AND NOT(w3 IN ws2Col)
WITH COLLECT(w3) AS ws3Col, w1, w3, ws4, ws2Col, gsCol, gCount, similarGs
//Match ws where depending on number of gs in w 1 or 2+ gs match searched w
UNWIND ws4 AS w4
OPTIONAL MATCH (w4)-[c5b:REL]->(g5:G)
WITH w1, w4, ws2Col, ws3Col, gsCol, similarGs, gCount, sum(c5b.amount) AS c6amount, SIZE((w4)-[:REL]-()) as gCount5, collect(g5) AS gs5, max(c5b.amount) as c6max
WHERE ALL(x IN gs5 WHERE x IN gsCol) AND (CASE WHEN gCount > 2 THEN c6amount > 25 ELSE c6amount > 65 END) AND NOT(w4 in ws2Col) AND NOT(w4 in ws3Col)
WITH  COLLECT(w4) AS ws4Col, w1, ws2Col, ws3Col, w4, gsCol, similarGs, gCount, c6amount, gCount5, gs5, c6max
UNWIND ws2Col AS ws2a UNWIND ws3Col AS ws3a UNWIND ws4Col AS ws4a
RETURN collect(distinct ws2a) AS match1, collect(distinct ws3a) AS match2, collect(distinct ws4a) AS match3

There are times in this query where w2, w3 or w4 can be null. That is expected behaviour, but when any of them is null the entire result is null or empty:

╒════════╤════════╤════════╕
│"match1"│"match2"│"match3"│
╞════════╪════════╪════════╡
│[]      │[]      │[]      │
└────────┴────────┴────────┘

I am expecting to see some results in match1 and/or match3 if match2 is null.

I have tried running the query without the collect(w2), collect(w3) and collect(w4), but this just causes the query to time out or exhaust the heap.

Can anyone suggest a way to stop a null from one optional match from dropping everything else in the query, or from turning the results of the other optional matches into null?

EDIT 1 --

I have found the point at which the query can break: the second WHERE after the optional match on w3, specifically this condition:

AND NOT(w3 IN ws2Col)

Even when I run a RETURN at this point, ws2Col comes back as null if w3 is null.

EDIT 2 --

@BrunoPeres's answer is almost there and got me a big step closer. I had to change the 2nd and 3rd COLLECT to FILTER so that the query does not drop these collections when one of the others is null. Here is the final query for anyone who comes across this.

//Match gs to searched w
MATCH (w1:W {name: "****"})-[:CONTAINS]->(gs:G)
WITH w1, COLLECT(DISTINCT gs) AS gsCol, SIZE((w1)-[:CONTAINS]-()) AS gCount
OPTIONAL MATCH (w1)-[:CONTAINS]-()-[:SIMILAR*0..1]->(gs:G)
WITH w1, gsCol, gCount, COLLECT(DISTINCT gs) AS similarGs
//Match all ws that contain gs in searched w or where similar as wsCol
OPTIONAL MATCH (w1)-[c2a:CONTAINS]->(g4:G)-[c2b:CONTAINS|:SIMILAR*0..1]-(ws:W)
WHERE c2a.amount - 10 < last(c2b).amount < c2a.amount + 10
WITH w1, gsCol, similarGs, gCount, COLLECT(DISTINCT ws) AS ws2, COLLECT(DISTINCT ws) AS ws3, COLLECT(DISTINCT ws) AS ws4
//Match ws from wsCol where all gs in new matched ws are same
UNWIND ws2 as w2
OPTIONAL MATCH (w2)-[c3:CONTAINS]->(g3:G)
WITH w1, w2, ws3, ws4, gsCol, similarGs, gCount, COLLECT(g3) AS gs3, SIZE((w2)-[:CONTAINS]->()) as gCount3, SUM(c3.amount) AS c3amount
WHERE ALL(x in gs3 WHERE x IN gsCol)
WITH w1, w2, ws3, ws4, gsCol, similarGs, gCount, gCount3, c3amount
WHERE gCount3 = gCount AND c3amount = 100
WITH COLLECT(w2) AS ws2Col, w1, ws3, ws4, gsCol, similarGs, gCount
//Match ws with gs that are in searched or similar to searched w
UNWIND ws3 as w3
WITH w1, w3, ws4, gsCol, similarGs, gCount, ws2Col
OPTIONAL MATCH (w3)-[c4:CONTAINS]->(g4:G)
WITH w1, w3, ws4, ws2Col, gsCol, similarGs, gCount, COLLECT(g4) AS gs4, SIZE((w3)-[:CONTAINS]->()) AS gCount4, SUM(c4.amount) AS c4amount
WHERE ALL(x in gs4 WHERE x in similarGs)
WITH w1, w3, ws4, ws2Col, gsCol, similarGs, gCount, gs4, gCount4, c4amount
WHERE gCount4 = gCount AND c4amount = 100 AND NOT(w3 IS NULL OR w3 IN ws2Col)
WITH CASE WHEN NOT(w3 IS NULL) THEN COLLECT(w3) ELSE ['none'] END AS ws3Col, w1, w3, ws4, ws2Col, gsCol, gCount, similarGs
//Match ws where depending on number of gs in w 1 or 2+ gs match searched w
UNWIND ws4 AS w4
OPTIONAL MATCH (w4)-[c5b:CONTAINS]->(g5:G)
WITH w1, w4, ws2Col, ws3Col, gsCol, similarGs, gCount, sum(c5b.amount) AS c6amount, SIZE((w4)-[:CONTAINS]-()) as gCount5, collect(g5) AS gs5, max(c5b.amount) as c6max, ws2Col + ws3Col AS wsC
WHERE ALL(x IN gs5 WHERE x IN gsCol) AND (CASE WHEN gCount > 2 THEN c6amount > 25 ELSE c6amount > 65 END) AND NOT(w4 in ws2Col OR w4 in ws3Col)
WITH CASE WHEN w4 IS NULL THEN ['none'] ELSE COLLECT(w4) END AS ws4Col, w1, ws2Col, ws3Col, w4, gsCol, similarGs, gCount, c6amount, gCount5, gs5, c6max
// Return results
UNWIND (CASE ws2Col WHEN [] THEN [null] ELSE ws2Col END) AS ws2a
UNWIND (CASE ws3Col WHEN [] THEN [null] ELSE ws3Col END) AS ws3a
UNWIND (CASE ws4Col WHEN [] THEN [null] ELSE ws4Col END) AS ws4a
RETURN collect(distinct ws2a) AS match1, collect(distinct ws3a) AS match2, collect(distinct ws4a) AS match3
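
The three guarded UNWINDs at the end matter because UNWIND over an empty list produces no rows, so one empty collection would otherwise wipe out the other two. A minimal sketch of just that part, using made-up literal lists in place of ws2Col, ws3Col and ws4Col (the values are assumptions for illustration, not data from my graph):

```cypher
// Hypothetical literal lists stand in for the three collections.
// ws3Col is deliberately empty to mimic "no match2 results".
WITH [1, 2] AS ws2Col, [] AS ws3Col, [3] AS ws4Col
// Swap an empty list for [null] so UNWIND still emits one row.
UNWIND (CASE ws2Col WHEN [] THEN [null] ELSE ws2Col END) AS ws2a
UNWIND (CASE ws3Col WHEN [] THEN [null] ELSE ws3Col END) AS ws3a
UNWIND (CASE ws4Col WHEN [] THEN [null] ELSE ws4Col END) AS ws4a
// collect() skips nulls, so the placeholder does not show up in the output.
RETURN collect(DISTINCT ws2a) AS match1,
       collect(DISTINCT ws3a) AS match2,
       collect(DISTINCT ws4a) AS match3
```

With these inputs match1 and match3 should keep their values while match2 comes back as an empty list. Replacing the second UNWIND with a plain UNWIND ws3Col AS ws3a should leave all three lists empty, which is the symptom from the original query.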

1 Answer

According to the docs section "The IN operator and null", testing whether null is IN a non-empty list that does not contain a match returns null rather than false.

So the following expression returns null:

RETURN null IN [1, 2, 3]

╒═══════════════════╕
│"null IN [1, 2, 3]"│
╞═══════════════════╡
│null               │
└───────────────────┘

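Consequently, the expression NOT(null IN [1, 2, 3]) returns null too. A WHERE clause keeps a row only when its predicate is true, so a null predicate removes the row just as false would.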
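
To see the effect on row filtering, here is a small sketch with literal values standing in for w3 and ws2Col (the values are assumptions chosen for illustration):

```cypher
// Literal values stand in for w3 and ws2Col; they are illustrative only.
UNWIND [1, null, 4] AS w3
WITH w3, [1, 2, 3] AS ws2Col
// 1    -> NOT(true)  = false -> row dropped
// null -> NOT(null)  = null  -> row dropped
// 4    -> NOT(false) = true  -> row kept
WHERE NOT (w3 IN ws2Col)
RETURN w3
```

Only the row with w3 = 4 should come back. The null row is not rejected because it is in the list, but because the predicate itself is null.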

I think you can fix your query by changing your test to:

AND NOT(w3 IS NULL OR w3 IN ws2Col)

That is: when w3 is null it is not considered an element of the list.
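
As a rough comparison of the two tests side by side, again with assumed literal values in place of w3 and ws2Col:

```cypher
// Literal values stand in for w3 and ws2Col; they are illustrative only.
UNWIND [1, null, 4] AS w3
WITH w3, [1, 2, 3] AS ws2Col
RETURN w3,
       NOT (w3 IN ws2Col)               AS originalTest,
       NOT (w3 IS NULL OR w3 IN ws2Col) AS rewrittenTest
```

For w3 = 1 both columns should be false and for w3 = 4 both should be true. For a null w3, originalTest is null while rewrittenTest is false, because true OR null evaluates to true. The rewritten test therefore always gives a definite boolean, but note that a null w3 still fails it, so a WHERE using it will still drop that row.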