I am trying to query a graph to return all of the paths with a specified relationship.

I am building a family tree and I have the following nodes:

  1. Person
  2. Edge

The relationships which connect two people are:

  1. PARENT
  2. CHILD
  3. COUPLE

Some example data:

  1. (a:Person)<-[:CHILD]-(:Edge)-[:PARENT]->(b:Person)
  2. (a:Person)<-[:COUPLE]-(r2:Edge)-[:COUPLE]->(c:Person)<-[:PARENT]-(r3:Edge)-[:CHILD]->(d:Person)

In my query I would like to be able to draw the graph in my front end so I need to get every Person and Edge Node, and every relationship between them, that is connected to my starting person.

I originally tried to use allShortestPaths to reduce the duplication of nodes, but this misses a lot of relationships (e.g. if parents have multiple children then only either the mother's or the father's relationship to a child is returned, not both). My current Cypher is as follows (I distinguish between Edge and Person nodes by filtering on the type property, which only exists on Edge nodes):

START person = node({personId})
MATCH person-[:PARENT | CHILD | COUPLE*]-(relatedPeople:Person)
WITH relatedPeople, person
MATCH p = allShortestPaths(person-[:PARENT|CHILD|COUPLE]-(relatedPeople))
WITH [n in nodes(p) WHERE not(has(n.type))] AS people, last([n in nodes(p) WHERE has(n.type) | id(n)]) AS relationship, [r in rels(p) | TYPE(r)] AS rels
WITH relationship, rels, last(people) AS destination, [p in people | id(p)] AS source
MATCH destination-[:NAME]->(name), destination-[:GENDER]->gender
RETURN source, id(destination), relationship, rels, collect(name), gender

I thought this was working but realised that the shortest path drops some relationships. I was thinking of just finding all related nodes and all of the relationships and then processing the results to construct the correct format, but I feel there is probably an easier/more efficient way.
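
To illustrate why a shortest-path search drops relationships, here is a minimal Python sketch (not Cypher) over a hypothetical three-person family. The node names (mother, father, child, r1 to r3) are made up for illustration. It runs a breadth-first search from one person and collects the Edge nodes that lie on any shortest path to another Person.

```python
from collections import deque

# Hypothetical data: each Edge node (r1..r3) links two Person nodes.
RELS = [
    ("r1", "PARENT", "mother"), ("r1", "CHILD", "child"),
    ("r2", "COUPLE", "mother"), ("r2", "COUPLE", "father"),
    ("r3", "PARENT", "father"), ("r3", "CHILD", "child"),
]
PEOPLE = {"mother", "father", "child"}


def neighbours(node):
    """Yield nodes adjacent to `node`, ignoring relationship direction."""
    for edge, _rel_type, person in RELS:
        if node == edge:
            yield person
        elif node == person:
            yield edge


def edges_on_shortest_paths(source):
    """Return the Edge nodes that appear on some shortest path from
    `source` to any other Person."""
    dist, preds = {source: 0}, {source: []}
    queue = deque([source])
    while queue:
        cur = queue.popleft()
        for nxt in neighbours(cur):
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                preds[nxt] = [cur]
                queue.append(nxt)
            elif dist[nxt] == dist[cur] + 1:
                preds[nxt].append(cur)  # another equally short route
    # Walk back from every Person target along shortest-path predecessors.
    covered, seen = set(), set()
    stack = [p for p in PEOPLE if p != source and p in dist]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node not in PEOPLE:
            covered.add(node)
        stack.extend(preds[node])
    return covered


print(sorted(edges_on_shortest_paths("mother")))
```

Starting from the mother, the father-to-child Edge node (r3) is never on a shortest path, which matches the missing relationships described above.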

UPDATE 06/01/2014

I have tried returning all paths but the model can have loops and therefore the number of paths returned increases exponentially with size. The cypher below creates a sample data set.

CREATE (p1:Person)<-[:CHILD]-(r1:Edge { type:'parentChild' })-[:PARENT]->(p2:Person)
<-[:COUPLE]-(r2:Edge { type:'couple' })-[:COUPLE]->(p3:Person)<-[:PARENT]-
(r3:Edge { type:'parentChild' })-[:CHILD]->p1<-[:COUPLE]-(r4:Edge { type:'couple' })-[:COUPLE]->
(p4:Person)<-[:CHILD]-(r5:Edge { type:'parentChild' })-[:PARENT]->(p5:Person)<-[:COUPLE]-
(r6:Edge { type:'couple' })-[:COUPLE]->(p6:Person)<-[:PARENT]-(r7:Edge { type:'parentChild' })-
[:CHILD]->p4, 
p5<-[:PARENT]-(r8:Edge { type:'parentChild' })-[:CHILD]->(p7:Person),
p5<-[:PARENT]-(r9:Edge { type:'parentChild' })-[:CHILD]->(p8:Person), 
p5<-[:PARENT]-(r10:Edge { type:'parentChild' })-[:CHILD]->(p9:Person)

The image shows the view of the data. (Image: Sample Data)

The desired result is a row for every Node-Edge-Node connection. There are 10 in total in this data but I get 128 rows when I return all paths due to the loops. Is there a more efficient method of filtering the paths?
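
The blow-up can be reproduced outside Neo4j with a small Python sketch of the sample data. It assumes paths behave like Cypher variable-length matches in that a relationship is not reused within a single path; the dictionary layout and function name are illustrative only. It enumerates every such path from p1 and then reduces them to distinct Node-Edge-Node connections.

```python
# Sample data from the CREATE statement: Edge node -> the two people it links.
CONNECTIONS = {
    "r1": ("p1", "p2"), "r2": ("p2", "p3"), "r3": ("p3", "p1"),
    "r4": ("p1", "p4"), "r5": ("p4", "p5"), "r6": ("p5", "p6"),
    "r7": ("p6", "p4"), "r8": ("p5", "p7"), "r9": ("p5", "p8"),
    "r10": ("p5", "p9"),
}


def person_paths(start):
    """Enumerate every path from `start` to another Person that never
    passes through the same Edge node twice."""
    paths = []

    def walk(person, used, hops):
        for edge, (a, b) in CONNECTIONS.items():
            if edge in used or person not in (a, b):
                continue
            other = b if person == a else a
            step = hops + [(person, edge, other)]
            paths.append(step)
            walk(other, used | {edge}, step)

    walk(start, frozenset(), [])
    return paths


paths = person_paths("p1")
# Collapse all hops into distinct, direction-independent connections.
connections = {
    (edge, frozenset((a, b))) for path in paths for a, edge, b in path
}
print(len(paths), "paths,", len(connections), "distinct connections")
```

The loops produce far more paths than there are connections, so deduplicating on the Edge node (rather than filtering paths) is what brings the result back down to the 10 rows wanted.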

What happens if you just do the path matches without the shortest path? Can you limit the duplication with aggregation or distinct? - Michael Hunger
Michael - thanks for the response but I can't figure out how to reduce the number of returned paths. I have added more detail to hopefully explain the problem a bit more. Thanks for your help. - rorymadden
Have you considered using the transactional REST resource and requesting "resultDataContents":"graph"? I don't know what you need back to visualize your data properly, but you could try matching your pattern, extracting and collecting all distinct relationships as one result row, request it as 'graph' per docs.neo4j.org/chunked/milestone/… and the result should be non-redundant and easily parseable for visualization. - jjaderberg
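
A rough Python sketch of the last suggestion follows. It only builds the request body and merges a response; it makes no network call. The response shape used here (results, data, graph, nodes, relationships, each item carrying an id) is an assumption based on the Neo4j 2.0 transactional endpoint and should be checked against the docs for the version in use. The function names are hypothetical.

```python
import json


def build_request(statement, parameters):
    """Build a JSON body asking for results in 'graph' format."""
    return json.dumps({
        "statements": [{
            "statement": statement,
            "parameters": parameters,
            "resultDataContents": ["graph"],
        }]
    })


def merge_graph_rows(response):
    """Merge the per-row graphs into one set of nodes and relationships,
    keyed by id so repeated items collapse. Response shape is assumed."""
    nodes, rels = {}, {}
    for result in response.get("results", []):
        for row in result.get("data", []):
            graph = row.get("graph", {})
            for node in graph.get("nodes", []):
                nodes[node["id"]] = node
            for rel in graph.get("relationships", []):
                rels[rel["id"]] = rel
    return nodes, rels


# Hand-written example response (assumed shape), with node "1" repeated.
sample = {"results": [{"data": [
    {"graph": {"nodes": [{"id": "1"}, {"id": "2"}],
               "relationships": [{"id": "10", "type": "CHILD",
                                  "startNode": "2", "endNode": "1"}]}},
    {"graph": {"nodes": [{"id": "1"}, {"id": "3"}],
               "relationships": [{"id": "11", "type": "COUPLE",
                                  "startNode": "3", "endNode": "1"}]}},
]}]}
nodes, rels = merge_graph_rows(sample)
print(sorted(nodes), sorted(rels))
```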

1 Answer

In order to resolve this I used the following cypher.

START person = node({personId})
MATCH person-[:PARENT | CHILD | COUPLE*0..]-(p:Person)
WITH distinct p
MATCH p-[r]-(edge:Edge), p-[:NAME]->(name), p-[:GENDER]->gender, edge-[:FACT]->(fact)
RETURN p, id(p), collect(name), gender, type(r), id(edge), collect(fact.type)
ORDER BY id(edge)

I used *0.. in the relationship pattern to match all of the connected "Person" nodes, including the original node. After removing the duplicates it was simply a case of finding all of the relationships between the nodes and the extra information on each point.

This Cypher returned a row for every "person" to "edge" relationship. There is some duplication in the "person" information, but this was easily parsed out once the data was returned.
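
For anyone wanting a starting point for that parsing step, here is a minimal Python sketch. The row keys (person_id, names, gender, rel_type, edge_id) are hypothetical stand-ins for the returned columns, not the actual driver output. It keeps one record per person and one link per person-to-edge row.

```python
def build_graph(rows):
    """Split result rows into unique people and person-to-edge links."""
    people, links = {}, []
    for row in rows:
        # Person details repeat on every row; keep the first copy only.
        people.setdefault(
            row["person_id"],
            {"names": row["names"], "gender": row["gender"]},
        )
        links.append({
            "person": row["person_id"],
            "edge": row["edge_id"],
            "type": row["rel_type"],
        })
    return people, links


# Hypothetical rows: person 1 appears twice, once per connected Edge node.
rows = [
    {"person_id": 1, "names": ["Ann"], "gender": "F",
     "rel_type": "CHILD", "edge_id": 10},
    {"person_id": 1, "names": ["Ann"], "gender": "F",
     "rel_type": "COUPLE", "edge_id": 11},
    {"person_id": 2, "names": ["Bob"], "gender": "M",
     "rel_type": "COUPLE", "edge_id": 11},
]
people, links = build_graph(rows)
print(len(people), "people,", len(links), "links")
```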

Thanks for the comments and help.