Cypher: List declared before UNWIND-ing a second list becomes null after UNWIND-ing the second list and executing a MATCH which returns no results

Question

I have the following scenario in a noe4j db: There are tasks which can be assigned to different users based on some criteria. There's an optional criterion (some tasks have a filter for user's location, some don't). I need to find all tasks for a user (if they have a location filter, I need to check user's location as well, if they don't I match only by the rest of the criteria).

I've tried to collect the tasks matching the mandatory criteria, then filter those which don't require the optional filter, then filter those which require the optional filter and match the current user and eventually merge the two lists.

Could you also suggest a more efficient way to do this please?

Here's a minimal example (of course, I have more complex matches after UNWIND)

WITH [{a: 'test'}, {a: 'a', b: 'b'}] AS initialList

WITH [i IN initialList WHERE i.b IS NULL] AS itemsWithoutB, initialList

UNWIND initialList AS item
MATCH (item) WHERE item.a IS NULL

RETURN COLLECT(item) + itemsWithoutB

I would expect here to have the content of itemsWithoutB returned, but I get no records (Response: []).

Note that if the MATCH done after UNWIND does actually return some records, then the content of itemsWithoutB is returned as well.

For example:

WITH [{a: 'test'}, {a: 'a', b: 'b'}] AS initialList

WITH [i IN initialList WHERE i.b IS NULL] AS itemsWithoutB, initialList

UNWIND initialList AS item
MATCH (item) WHERE item.a IS NOT NULL

RETURN COLLECT(item) + itemsWithoutB

this returns:

╒═════════════════════════════════════════════╕
│"COLLECT(item) + itemsWithoutB"              │
╞═════════════════════════════════════════════╡
│[{"a":"test"},{"a":"a","b":"b"},{"a":"test"}]│
└─────────────────────────────────────────────┘

Neo4j version: enterprise 3.5.6

What am I missing here, please?

---EDITED---

I'm adding here a more complex example, closer to the real scenario:

Generate initial data:

MERGE (d:Device {code: 'device1', latitude:90.5, longitude: 90.5})-[:USED_BY]->(u:User {name: 'user1'})-[:WORKS_IN]->(c:Country {code: 'RO'})<-[targets:TARGETS]-(:Task {name: 'task1', latitude: 90.5, longitude: 90.5, maxDistance: 1000, maxMinutesAfterLastInRange: 99999})<-[:IN_RANGE {timestamp: datetime()}]-(d)

MERGE (c)<-[:TARGETS]-(:Task {name: 'task2'})

MERGE (c)<-[:TARGETS]-(:Task {name: 'task4', latitude: 10.5, longitude: 10.5, maxDistance: 1, maxMinutesAfterLastInRange: 99999})

CREATE (:User  {name: 'user2'})-[:WORKS_IN]->(:Country {code: 'GB'})<-[:TARGETS]-(:Task {name: 'task3'})

Here's a neo4j console link for this example.

I want to be able to use the same query to find the tasks for any user (task1 and task2 should be returned for user1, task3 for user2, task4 shouldn't be returned for neither of them).

The following query works for user1, but doesn't work if I change the user name filter to "user2":

MATCH (user:User {name: "user1"})-[:WORKS_IN]->(country) 
OPTIONAL MATCH (device:Device)-[:USED_BY]->(user)
WITH country, device
MATCH (task:Task)-[:TARGETS]->(country) 
WITH COLLECT(task) AS filteredTasks, device

WITH [t IN filteredTasks WHERE t.latitude IS NULL OR t.longitude IS NULL] AS matchedTasksWithoutLocationFilter, filteredTasks, device

UNWIND filteredTasks AS task
MATCH (device)-[inRange:IN_RANGE]->(task)
WHERE task.maxMinutesAfterLastInRange IS NOT NULL 
AND duration.between(datetime(inRange.timestamp), datetime()).minutes <= task.maxMinutesAfterLastInRange

RETURN COLLECT(task) + matchedTasksWithoutLocationFilter AS matchedTasks

Pablissimo Pablissimo · Accepted Answer · 2019-09-05T15:03:00

Updated answer based on new information

I think you can do this in one shot and not need list comprehensions.

   MATCH (user: User {name: "user1" })-[:WORKS_IN]->(country)<-[:TARGETS]-(task: Task)
OPTIONAL MATCH (task)<-[inRange: IN_RANGE]-(device: Device)-[:USED_BY]->(user)
    WITH task, inRange
   MATCH (task)
   WHERE (task.latitude IS NULL OR task.longitude IS NULL)
      OR (inRange IS NOT NULL AND 
          task.maxMinutesAfterLastInRange IS NOT NULL AND 
          duration.between(datetime(inRange.timestamp), datetime()).minutes <= task.maxMinutesAfterLastInRange)
  RETURN task

For user1:

╒══════════════════════════════════════════════════════════════════════╕
│"task"                                                                │
╞══════════════════════════════════════════════════════════════════════╡
│{"name":"task2"}                                                      │
├──────────────────────────────────────────────────────────────────────┤
│{"name":"task1","maxDistance":1000,"maxMinutesAfterLastInRange":99999,│
│"latitude":90.5,"longitude":90.5}                                     │
└──────────────────────────────────────────────────────────────────────┘

For user2:

╒════════════════╕
│"task"          │
╞════════════════╡
│{"name":"task3"}│
└────────────────┘

Original answer

When your MATCH doesn't return any nodes (in that example, all nodes have an a property), the rest of the query's got no work to do - sort of like a failed inner join in a traditional SQL database.

If you switch to an OPTIONAL MATCH then you'll see results from itemsWithoutB irrespective of whether the MATCH worked. I know your example's synthetic so I'm not sure if that's what you're after - in your example the COLLECT(item) is going to be working off the item from UNWIND, and the result of the OPTIONAL MATCH is basically irrelevant. Still, imagine that these are real nodes with real queries:

WITH [{a: 'test'}, {a: 'a', b: 'b'}] AS initialList
WITH [i IN initialList WHERE i.b IS NULL] AS itemsWithoutB, initialList
UNWIND initialList AS item
OPTIONAL MATCH (item) WHERE item.a IS NULL
RETURN COLLECT(item) + itemsWithoutB

You may need to do some further work to de-duplicate the results.

Cypher: List declared before UNWIND-ing a second list becomes null after UNWIND-ing the second list and executing a MATCH which returns no results

1 Answers

Updated answer based on new information

Original answer