0
votes

I have a complex graph as shown in image below :

enter image description here

Here every relationship has a type value. I need to write a cypher query to find all relationships (with their type values) among given set of nodes (two or more). The nodes can be entered in any order, like x64->Linux->Oracle or Oracle->Linux->10.2.

EDIT

I am expecting output like this. All the combination of nodes with relationship name that links them.

  1. For input : x64->Linux->Oracle

enter image description here

  1. For input : Linux->64->Oracle->12c

enter image description here

DATA

Data can be accessed from https://www.dropbox.com/s/w28omkdrgmhv7ud/Neo4j%20Queries.txt?dl=0

EDIT New Output format for input x64->Linux->Oracle

enter image description here

2
Can you post the resulting graph of CALL apoc.meta.graph(); please? It renders your node and label structure, which helps us to provide the answer as concrete as possible.ThirstForKnowledge

2 Answers

1
votes

Provided you're looking for only relationships directly connecting each pair of nodes in the set (as opposed to finding all multi-hop paths between pairs of nodes in the set), APOC Procedures has apoc.algo.cover() for exactly this use case:

...
// assume `nodes` is the collection of nodes
CALL apoc.algo.cover(nodes) YIELD rel
RETURN rel

EDIT

As mentioned in my comment, your change to the requirements drastically changes the nature of the question.

You seem to want complete path results (directed), including nodes that are not in your input, and you want to ensure that the same type attribute is present for all relationships in the path.

This requires that we find the ordering of those nodes so we can identify a path between all of them. While we could find all possible permutations of the input nodes (for the order of traversal for the paths), I think we can get away with just finding permutations of 2 for the start and end nodes (by UNWINDING the collection twice and removing rows where the start and end node are the same). We'll want to first find all the input and output relationship types so we can use some set operations (output types of the start node intersected with input types of the end node intersected with all the (intersected) input and output types of the other nodes) to find the potential types that can possibly be present on relationships that can connect all the nodes.

From the remaining rows after this filtering we can match to the variable-length path that can connect all of those nodes, using only the types provided so that each path only traverses traverses relationships with a single type. Afterwards we filter to ensure that all the input nodes are in the path.

We'll assume the nodes are of type :Node with the property 'name'.

MATCH (entity:Entity) 
WHERE entity.key in ['Product','Version','BinaryType'] AND entity.value in ['pc','10.2','64']
WITH collect(entity) as nodes
UNWIND nodes as node
WITH nodes, node, [()-[r]->(node) | r.type] as inputTypes, [(node)-[r]->() | r.type] as outputTypes
WITH nodes, node, apoc.coll.toSet(inputTypes) as inputTypes, apoc.coll.toSet(outputTypes) as outputTypes
WITH nodes, collect({node:node, inputTypes:inputTypes, outputTypes:outputTypes}) as nodeData
UNWIND nodeData as start
UNWIND nodeData as end
WITH nodes, start, end, nodeData
WHERE start <> end
WITH nodes, start, end, apoc.coll.subtract(nodeData, [start, end]) as theRest
WITH nodes, start.node as start, end.node as end, apoc.coll.intersection(start.outputTypes, end.inputTypes) as possibleTypes, [data in theRest | apoc.coll.intersection(data.inputTypes, data.outputTypes)] as otherTypes
WITH nodes, start, end, reduce(possibleTypes = possibleTypes, types in otherTypes | apoc.coll.intersection(possibleTypes, types)) as possibleTypes
WHERE size(possibleTypes) > 0
UNWIND possibleTypes as type
MATCH path = (start)-[*]->(end)
WHERE all(rel in relationships(path) WHERE rel.type = type) 
 AND length(path) >= size(nodes) - 1 
 AND all(node in nodes WHERE node in nodes(path))
RETURN nodes(path) as pathNodes, type

To do this with both type and level, we need to collect both of them earlier in the query, so instead of dealing with just a type, we're dealing with a map of both the type and the level. This does make the query a bit more complex, but it's necessary to ensure that the paths provided have the same type and level for all relationships in the path.

MATCH (entity:Entity) 
WHERE entity.key in ['Product','Version','BinaryType'] AND entity.value in ['pc','10.2','64']
WITH collect(entity) as nodes
UNWIND nodes as node
WITH nodes, node, [()-[r]->(node) | {type:r.type, level:r.level}] as inputs, [(node)-[r]->() | {type:r.type, level:r.level}] as outputs
WITH nodes, collect({node:node, inputs:apoc.coll.toSet(inputs), outputs:apoc.coll.toSet(outputs)}) as nodeData
UNWIND nodeData as start
UNWIND nodeData as end
WITH nodes, start, end, nodeData
WHERE start <> end
WITH nodes, start, end, apoc.coll.subtract(nodeData, [start, end]) as theRest
WITH nodes, start.node as start, end.node as end, apoc.coll.intersection(start.outputs, end.inputs) as possibles, [data in theRest | apoc.coll.intersection(data.inputs, data.outputs)] as others
WITH nodes, start, end, reduce(possibles = possibles, data in others | apoc.coll.intersection(possibles, data)) as possibles
WHERE size(possibles) > 0
UNWIND possibles as typeAndLevel
MATCH path = (start)-[*]->(end)
WHERE all(rel in relationships(path) WHERE rel.type = typeAndLevel.type AND rel.level = typeAndLevel.level) 
 AND length(path) >= size(nodes) - 1 
 AND all(node in nodes WHERE node in nodes(path))
RETURN nodes(path) as pathNodes, typeAndLevel.type as type, typeAndLevel.level as level
1
votes

Remarks

Before I render the solution and result, I would like to suggest a revision of your model.

  • If not already done, introduce different node types in the form of labels (Architecture, Software, SoftwareVersion, etc.) which will make your data retrieval much easier.
  • Depending on the number of source databases domain_database_n and the parallel SUPPORTS relationships resulting from it, the use of multiple labels for a node could be clearer and more performant. A bigger amount of relationships would be superfluous in this case.
  • Neo4j doesn't have a focus on searching and filtering of relationship properties. Modeling those attributes as node or property of a node is significantly more performant, especially for large graphs.
  • Consider the Neo4j naming conventions.

Your graph / initial situation

For the ease of possible further answers and solutions I note my graph creating statement:

CREATE
  (pc:UntypedNode {name: 'PC'})-[:SUPPORTS {type: 'domain_database_1'}]->(tenDotTwo:UntypedNode {name:'10.2'}),
  (pc)-[:SUPPORTS {type: 'domain_database_2'}]->(tenDotTwo),
  (pc)-[:SUPPORTS {type: 'domain_database_3'}]->(tenDotTwo),
  (pc)-[:SUPPORTS {type: 'domain_database_4'}]->(tenDotTwo),
  (pc)-[:SUPPORTS {type: 'domain_database_5'}]->(tenDotTwo),
  (pc)-[:SUPPORTS {type: 'domain_database_6'}]->(tenDotTwo),
  (pc)-[:SUPPORTS {type: 'domain_database_7'}]->(tenDotTwo),
  (tenDotTwo)-[:SUPPORTS {type: 'domain_database_1'}]->(linux:UntypedNode {name:'Linux'}),
  (tenDotTwo)-[:SUPPORTS {type: 'domain_database_2'}]->(linux),
  (tenDotTwo)-[:SUPPORTS {type: 'domain_database_3'}]->(linux),
  (tenDotTwo)-[:SUPPORTS {type: 'domain_database_4'}]->(linux),
  (tenDotTwo)-[:SUPPORTS {type: 'domain_database_5'}]->(linux),
  (tenDotTwo)-[:SUPPORTS {type: 'domain_database_6'}]->(linux),
  (tenDotTwo)-[:SUPPORTS {type: 'domain_database_7'}]->(linux),
  (linux)-[:SUPPORTS {type: 'domain_database_1'}]->(sevenDotZero:UntypedNode {name:'7.0'}),
  (linux)-[:SUPPORTS {type: 'domain_database_2'}]->(sevenDotZero),
  (linux)-[:SUPPORTS {type: 'domain_database_3'}]->(sevenDotZero),
  (linux)-[:SUPPORTS {type: 'domain_database_4'}]->(sevenDotZero),
  (linux)-[:SUPPORTS {type: 'domain_database_5'}]->(sevenDotZero),
  (linux)-[:SUPPORTS {type: 'domain_database_6'}]->(sevenDotZero),
  (linux)-[:SUPPORTS {type: 'domain_database_7'}]->(sevenDotZero),
  (sevenDotZero)-[:SUPPORTS {type: 'domain_database_1'}]->(x64:UntypedNode {name:'x64'}),
  (sevenDotZero)-[:SUPPORTS {type: 'domain_database_2'}]->(x64),
  (sevenDotZero)-[:SUPPORTS {type: 'domain_database_3'}]->(x64),
  (sevenDotZero)-[:SUPPORTS {type: 'domain_database_4'}]->(x64),
  (sevenDotZero)-[:SUPPORTS {type: 'domain_database_5'}]->(x64),
  (sevenDotZero)-[:SUPPORTS {type: 'domain_database_6'}]->(x64),
  (sevenDotZero)-[:SUPPORTS {type: 'domain_database_7'}]->(x64),
  (x64)-[:SUPPORTS {type: 'domain_database_1'}]->(sixtyFour:UntypedNode {name:'64'}),
  (x64)-[:SUPPORTS {type: 'domain_database_2'}]->(sixtyFour),
  (x64)-[:SUPPORTS {type: 'domain_database_3'}]->(sixtyFour),
  (x64)-[:SUPPORTS {type: 'domain_database_4'}]->(sixtyFour),
  (x64)-[:SUPPORTS {type: 'domain_database_5'}]->(sixtyFour),
  (x64)-[:SUPPORTS {type: 'domain_database_6'}]->(sixtyFour),
  (x64)-[:SUPPORTS {type: 'domain_database_7'}]->(sixtyFour),
  (sixtyFour)-[:SUPPORTS {type: 'domain_database_1'}]->(sqlServer:UntypedNode {name:'SQL Server'}),
  (sixtyFour)-[:SUPPORTS {type: 'domain_database_2'}]->(sqlServer),
  (sixtyFour)-[:SUPPORTS {type: 'domain_database_3'}]->(sqlServer),
  (sqlServer)-[:SUPPORTS {type: 'domain_database_1'}]->(year2014:UntypedNode {name:'2014'}),
  (sqlServer)-[:SUPPORTS {type: 'domain_database_2'}]->(year2016:UntypedNode {name:'2016'}),
  (sqlServer)-[:SUPPORTS {type: 'domain_database_3'}]->(year2017:UntypedNode {name:'2017'}),
  (year2014)-[:SUPPORTS {type: 'domain_database_1'}]->(s:UntypedNode {name:'S'}),
  (year2016)-[:SUPPORTS {type: 'domain_database_2'}]->(s),
  (year2017)-[:SUPPORTS {type: 'domain_database_3'}]->(s),
  (sixtyFour)-[:SUPPORTS {type: 'domain_database_4'}]->(oracle:UntypedNode {name:'Oracle'}),
  (sixtyFour)-[:SUPPORTS {type: 'domain_database_5'}]->(oracle),
  (sixtyFour)-[:SUPPORTS {type: 'domain_database_6'}]->(oracle),
  (sixtyFour)-[:SUPPORTS {type: 'domain_database_7'}]->(oracle),
  (oracle)-[:SUPPORTS {type: 'domain_database_4'}]->(release12c:UntypedNode {name:'12c'}),
  (oracle)-[:SUPPORTS {type: 'domain_database_5'}]->(release12gr2:UntypedNode {name:'12gR2'}),
  (oracle)-[:SUPPORTS {type: 'domain_database_6'}]->(release12cr:UntypedNode {name:'12cR'}),
  (oracle)-[:SUPPORTS {type: 'domain_database_7'}]->(release12cr1:UntypedNode {name:'12cR1'}),
  (release12c)-[:SUPPORTS {type: 'domain_database_4'}]->(s),
  (release12gr2)-[:SUPPORTS {type: 'domain_database_5'}]->(s),
  (release12cr)-[:SUPPORTS {type: 'domain_database_6'}]->(s),
  (release12cr1)-[:SUPPORTS {type: 'domain_database_7'}]->(s);

Solution

MATCH (n:UntypedNode)
  WHERE n.name IN $names
WITH collect(n) AS nodes
UNWIND nodes AS n
UNWIND nodes AS m
WITH *
  WHERE id(n) < id(m)
MATCH path = allShortestPaths((n)-[relation*..10]-(m))
  WHERE ALL(x IN relation
    WHERE x.type = $relationshipName)
WITH
  collect({ path: path, pathLength: length(path) }) AS data,
  max(length(path)) AS maxLength
WITH [row IN data WHERE row.pathLength = maxLength] AS rows
UNWIND rows AS row
RETURN row.path AS path;

with parameters:

  • "nodeNames": ['Oracle', 'Linux', '10.2']
  • "relationshipName": 'domain_database_4'

Explanation:

  • line 1-2: identification of the given starting nodes
  • line 3-5: create two sets of individual rows for the starting nodes
  • line 6-7: take only one direction of each relationship and exclude node self references
  • line 8: compute all shortest paths between both sets of starting node rows until a length of 10 relationships
  • line 9-10: filter all resulting paths regarding relationships with specified type (parameter relationshipName)
  • line 11-15: filter for the longest „shortest path“, omitting part paths
  • line 16: render the resulting path(s)

Result

╒═══════════════════════════════════════════════════════════╕
│"path"                                                     │
╞═══════════════════════════════════════════════════════════╡
│[{"name":"10.2"},{"type":"domain_database_4"},{"name":"Linu│
│x"},{"name":"Linux"},{"type":"domain_database_4"},{"name":"│
│7.0"},{"name":"7.0"},{"type":"domain_database_4"},{"name":"│
│x64"},{"name":"x64"},{"type":"domain_database_4"},{"name":"│
│64"},{"name":"64"},{"type":"domain_database_4"},{"name":"Or│
│acle"}]                                                    │
└───────────────────────────────────────────────────────────┘

graph