I might be overcomplicating this, but given some sample data that looks close to your diagram:
MERGE (sd: Node { name: 'SD' })
MERGE (ssd: Node { name: 'SSD' })
MERGE (ld: Node { name : 'LD' })
MERGE (ceo: Node { name: 'CEO' })
MERGE (cto: Node { name: 'CTO' })
MERGE (st: Node { name: 'ST' })
MERGE (vf: Node { name: 'VF' })
MERGE (bo: Node { name: 'BO' })
MERGE (md: Node { name: 'MD' })
MERGE (bb: Node { name: 'BB' })
MERGE (sd)-[:NEXTEXP { id: 1 }]->(ssd)
MERGE (ssd)-[:NEXTEXP { id: 1 }]->(ld)
MERGE (ld)-[:NEXTEXP { id: 1 }]->(ceo)
MERGE (sd)-[:NEXTEXP { id: 2 }]->(ssd)
MERGE (ssd)-[:NEXTEXP { id: 2 }]->(ld)
MERGE (ssd)-[:NEXTEXP { id: 2 }]->(vf)
MERGE (ssd)-[:NEXTEXP { id: 2 }] ->(bo)
MERGE (sd)-[:NEXTEXP { id: 3 }]->(st)
MERGE (st)-[:NEXTEXP { id: 3 }]->(bo)
MERGE (bo)-[:NEXTEXP { id: 3 }]->(cto)
MERGE (sd)-[:NEXTEXP { id: 4 }]->(st)
MERGE (st)-[:NEXTEXP { id: 4 }]->(md)
MERGE (st)-[:NEXTEXP { id: 5 }]->(bb)
The following appears to yield the right sort of answer:
MATCH path=(s: Node { name: 'SD' })-[:NEXTEXP*1..2]->(other: Node)
WITH distinct nodes(path) as pathNodes,
relationships(path) as pathRels,
reduce(minValue = head(relationships(path)).id, x in relationships(path) | (case when x.id = minValue then x.id else null end)) as pathRelIds
WHERE pathRelIds IS NOT NULL
WITH pathNodes, count(pathNodes) as ct, size(pathNodes) as pathLength
RETURN pathLength as `Length`, pathNodes as `Paths`, ct as `Count`
ORDER BY `Length`, `Paths`, `Count`
The query assumes that you need the ID of the relationships to be the same throughout the path. This seemed to be what you're asking since in your 'expected' table SD -> SSD -> LD has a count of 2, rather than 3.
Breaking it down a bit:
- We match for between 1 and 2 relationships and assign the resulting paths to a variable called
path
- The
reduce
call in the second line tries to return either the smallest relationship ID within the path, or NULL
if there are multiple relationship IDs - it's a bit of a cheat that relies on how Cypher treats NULL
in expressions
- The third line eliminates those cases where the previous
reduce
returned NULL
- that is, we eliminate paths where there's more than one ID on the relationships in the path
For the exact output you're asking for (where the names of nodes are joined into a single string), the following amended query:
MATCH path=(s: Node { name: 'SD' })-[:NEXTEXP*1..2]->(other: Node)
WITH distinct s,
nodes(path) as pathNodes,
relationships(path) as pathRels,
reduce(minValue = head(relationships(path)).id, x in relationships(path) | (case when x.id = minValue then x.id else null end)) as pathRelIds
WHERE pathRelIds IS NOT NULL
WITH s, pathNodes, count(pathNodes) as ct, size(pathNodes) as pathLength
RETURN pathLength as `Length`,
reduce(pathStr = s.name, x in pathNodes[1..] | pathStr + ' --> ' + x.name) as `Paths`,
ct as `Count`
ORDER BY `Length`, `Count` DESC, `Paths`
Yields the following output:
╒════════╤═══════════════════╤═══════╕
│"Length"│"Paths" │"Count"│
╞════════╪═══════════════════╪═══════╡
│2 │"SD --> SSD" │2 │
├────────┼───────────────────┼───────┤
│2 │"SD --> ST" │2 │
├────────┼───────────────────┼───────┤
│3 │"SD --> SSD --> LD"│2 │
├────────┼───────────────────┼───────┤
│3 │"SD --> SSD --> BO"│1 │
├────────┼───────────────────┼───────┤
│3 │"SD --> SSD --> VF"│1 │
├────────┼───────────────────┼───────┤
│3 │"SD --> ST --> BO" │1 │
├────────┼───────────────────┼───────┤
│3 │"SD --> ST --> MD" │1 │
└────────┴───────────────────┴───────┘
Which differs from your expected only by the SD --> SSD --> BO row, which is probably due to me misinterpreting your diagram.
MATCH path = (start)-[*..3]-() UNWIND rels(path) as rel WITH path, rel WHERE type(rel) = 'NextExp' RETURN path, count(*), length(path)
– Michael Hunger