
I have the following graph

[image of the graph]

I am trying to model the above graph in Neo4j such that for any duplicate node (say A), a property 'count' on the node is incremented to reflect the number of occurrences of A. Similarly, for any duplicate relationship (say A->B), a property 'frequency' is updated.

A Neo4j console for the graph is implemented here.

I modeled the graph this way so that I could track the number of occurrences of each node and of each individual transition.

The next part of my requirement is to track all 3-node paths. This is the query I issue, the output of which is visible in the Neo4j console:

MATCH (n)-[]->(m)-[]->(p) RETURN n.name+' - '+m.name+' - '+p.name AS NewName

However, the output I would like to have is:

A - B - C
B - D - A
D - A - B
B - D - E
E - B - C
D - E - B
A - B - D

But since the nodes and relationships are unique, one additional combination is also reported:

E - B - D

QUESTIONS

  1. What would I need to change in the graph setup/query so that only the 7 listed combinations are reported instead of 8?

  2. Is there a way to also count the frequency of each such 3-node path?

I am fine with creating multiple Cypher scripts to achieve this. That being said, the Cypher scripts are auto-generated as part of a bigger workflow, and I would like to avoid manually typing the n-node path(s) and their frequencies.

Very good question - I am not sure you can solve it using plain Cypher (not to mention having reasonable performance). Can you use Neo4j procedures (apoc)? - Gabor Szarnyas
A - B - D is a valid 3-node combination, so there are actually 7 valid combinations. - cybersam
@cybersam: You are right. A - B - D is also a valid combination. I have edited the question to reflect that as well. - GvanJoic

1 Answer


You should change your data model. You do not need to calculate and update count and frequency properties; instead, they can be obtained directly from the appropriate data model.

For example, let's use an alternate data model to create your sample data. Every named node is represented once as a Foo node. Every usage of a Foo node is represented by a Bar node that references its Foo node via a FOR relationship. Bar nodes are linked together via NEXT relationships.

Create the sample data:

CREATE (a:Foo {name: 'A'}), (b:Foo {name: 'B'}), (c:Foo {name: 'C'}), (d:Foo {name: 'D'}), (e:Foo {name: 'E'})
CREATE (a1:Bar)-[:FOR]->(a), (b1:Bar)-[:FOR]->(b),
  (a1)-[:NEXT]->(b1)
CREATE (c1:Bar)-[:FOR]->(c),
  (b1)-[:NEXT]->(c1)
CREATE (d1:Bar)-[:FOR]->(d),
  (b1)-[:NEXT]->(d1)
CREATE (a2:Bar)-[:FOR]->(a),
  (d1)-[:NEXT]->(a2)
CREATE (b2:Bar)-[:FOR]->(b),
  (a2)-[:NEXT]->(b2)
CREATE (e1:Bar)-[:FOR]->(e),
  (d1)-[:NEXT]->(e1)
CREATE (b3:Bar)-[:FOR]->(b),
  (e1)-[:NEXT]->(b3)
CREATE (c2:Bar)-[:FOR]->(c),
  (b3)-[:NEXT]->(c2);

Get the number of times each Foo node is used:

MATCH (f:Foo)<-[:FOR]-(:Bar) RETURN f, COUNT(*) AS count;

Get the number of times each pair of Foo nodes is used in sequence:

MATCH (f1:Foo)<-[:FOR]-()-[:NEXT]->()-[:FOR]->(f2:Foo)
RETURN f1, f2, COUNT(*) AS count;

Get the number of times each triplet of Foo nodes is used in sequence:

MATCH (f1:Foo)<-[:FOR]-()-[:NEXT]->(b1)-[:FOR]->(f2:Foo),
  (b1)-[:NEXT]->(b2)-[:FOR]->(f3:Foo)
RETURN f1, f2, f3, COUNT(*) AS count;

Here is the result of the last query, which shows all 7 valid triplets and the number of times they occurred:

+----------------------------------------------------------------------+
| f1                | f2                | f3                | count    |
+----------------------------------------------------------------------+
| Node[3]{name:"D"} | Node[4]{name:"E"} | Node[1]{name:"B"} | 1        |
| Node[1]{name:"B"} | Node[3]{name:"D"} | Node[4]{name:"E"} | 1        |
| Node[4]{name:"E"} | Node[1]{name:"B"} | Node[2]{name:"C"} | 1        |
| Node[3]{name:"D"} | Node[0]{name:"A"} | Node[1]{name:"B"} | 1        |
| Node[1]{name:"B"} | Node[3]{name:"D"} | Node[0]{name:"A"} | 1        |
| Node[0]{name:"A"} | Node[1]{name:"B"} | Node[3]{name:"D"} | 1        |
| Node[0]{name:"A"} | Node[1]{name:"B"} | Node[2]{name:"C"} | 1        |
+----------------------------------------------------------------------+
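
Since the question mentions auto-generated scripts and n-node paths, the triplet query can probably be generalized with a variable-length NEXT pattern rather than one hand-written pattern per length. The following is only a sketch under some assumptions: it relies on pattern comprehension (available from Neo4j 3.1), it assumes every Bar node has exactly one FOR relationship as in the sample data, and the NewName and frequency column names are borrowed from the question rather than required by anything. The hop count cannot be passed as a query parameter, so the generator would have to write n-1 into the query text.

```cypher
// Sketch: count every 3-node sequence. For n-node sequences, the script
// generator would replace the 2 with n-1 (the number of NEXT hops).
MATCH p = (:Bar)-[:NEXT*2]->(:Bar)
// Map each Bar on the path to the name of the Foo node it stands for.
WITH [bar IN nodes(p) | head([(bar)-[:FOR]->(foo:Foo) | foo.name])] AS names
// Join the names as 'A - B - C' and count identical sequences.
RETURN reduce(s = head(names), x IN tail(names) | s + ' - ' + x) AS NewName,
       count(*) AS frequency
ORDER BY NewName;
```

On the sample data above this should list the same seven triplets, each with a frequency of 1, in the 'A - B - C' format requested in the question.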