0
votes

Given a graph where molecules are attached to a common scaffold with [:substructureOf] relationships and where similar molecules are connected to each other with an [:isSimilarTo] relationship, is there a way to return all [:isSimilarTo] relationships for a specific subset of molecules?

In pseudo Cypher code, treating m as a collection (the subset of molecules), I would like to check that m1 and m2 of each sim relationship are both part of m:

MATCH (:Scaffold {Name: 'A'})-[:substructureOf]->(m:Molecule)
WITH m MATCH (m1:Molecule)-[sim:isSimilarTo]-(m2:Molecule)
WHERE m1 IN m AND m2 IN m

Is there a proper Cypher way to write this? An example dataset is below.

CREATE (:Scaffold {Name: 'A'});
CREATE (:Scaffold {Name: 'B'});
MATCH (s:Scaffold {Name: 'A'}) MERGE (s) -[:substructureOf]->(:Molecule {Name: 'm1'});
MATCH (s:Scaffold {Name: 'A'}) MERGE (s) -[:substructureOf]->(:Molecule {Name: 'm2'});
MATCH (s:Scaffold {Name: 'A'}) MERGE (s) -[:substructureOf]->(:Molecule {Name: 'm3'});
MATCH (s:Scaffold {Name: 'A'}) MERGE (s) -[:substructureOf]->(:Molecule {Name: 'm4'});
MATCH (s:Scaffold {Name: 'B'}) MERGE (s) -[:substructureOf]->(:Molecule {Name: 'm5'});
MATCH (m:Molecule {Name: 'm1'}), (n:Molecule {Name: 'm2'}) CREATE (m) -[:isSimilarTo]-> (n);
MATCH (m:Molecule {Name: 'm1'}), (n:Molecule {Name: 'm3'}) CREATE (m) -[:isSimilarTo]-> (n);
MATCH (m:Molecule {Name: 'm2'}), (n:Molecule {Name: 'm3'}) CREATE (m) -[:isSimilarTo]-> (n);
MATCH (m:Molecule {Name: 'm3'}), (n:Molecule {Name: 'm4'}) CREATE (m) -[:isSimilarTo]-> (n);
MATCH (m:Molecule {Name: 'm4'}), (n:Molecule {Name: 'm5'}) CREATE (m) -[:isSimilarTo]-> (n);
2
I don't understand your question. Do you simply want all similar molecules for the ones you MATCH in the first query? More constraints? Please explain. – Martin Preusse
Actually, I want to retrieve all similarity relationships. In the given data, m1, m2, m3 and m4 share scaffold A. I'd like to know whether m1 is similar to m2, m3 and m4, whether m2 is similar to m3 and m4, etc., without caring whether m1-m4 are similar to m5. The difficulty, for me, is traversing m1-m4 twice. The idea of the pseudo code is to restrict the match to similarity edges whose start and end nodes both belong to the m1-m4 set. How to write this properly in Cypher is my question. – Pierre
Actually, the pseudo code could work if m were a collection and not just nodes: WITH collect(m) MATCH... – Pierre

2 Answers

1
votes

I still don't fully understand what you want; a network scheme or drawing might help next time.

But I think what you want is similar molecules sharing the same scaffold, i.e. all Molecule pairs connected by isSimilarTo edges where both Molecules are linked to a given Scaffold.

You can get this by matching the complete path:

(Scaffold)--(Molecule)--(similar Molecule)--(same Scaffold)

In Cypher:

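MATCH (s:Scaffold {Name: 'A'})-[:substructureOf]->(m1:Molecule)
      -[sim:isSimilarTo]-(m2:Molecule)<-[:substructureOf]-(s)
// Return the relationships (property names are case-sensitive: Name, not name;
// sim.value is null unless the relationship carries a value property)
RETURN DISTINCT m1.Name, sim.value, m2.Name
// Or, instead, return the count of relationships:
// RETURN count(DISTINCT sim)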
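
Because the isSimilarTo pattern is matched without a direction, each pair comes back twice, once as (m1, m2) and once as (m2, m1). A minimal sketch of a variant that follows the similarity edge in its stored direction so each relationship appears once, assuming the direction used in the example dataset (Scaffold to Molecule for substructureOf) and that the similarity edges were created with the isSimilarTo type; the aliases source and target are only illustrative:

```cypher
// Both molecules must hang off the same scaffold s
MATCH (s:Scaffold {Name: 'A'})-[:substructureOf]->(m1:Molecule)
      -[sim:isSimilarTo]->(m2:Molecule)<-[:substructureOf]-(s)
// One row per similarity relationship inside scaffold A
RETURN m1.Name AS source, m2.Name AS target
ORDER BY source, target
```

On the example dataset this should list m1-m2, m1-m3, m2-m3 and m3-m4, and leave out m4-m5 because m5 belongs to scaffold B.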
0
votes

Alternative answer:

MATCH (s:Scaffold {Name: 'A'})-[:substructureOf*]->(m:Molecule)
WITH collect(m) AS mols
MATCH p=(:Molecule)-[sim:isSimilarTo]-(:Molecule)
WHERE startNode(sim) IN mols AND endNode(sim) IN mols
RETURN p

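The overall path is shorter: the scaffold is matched only once to collect the molecules, and the second MATCH covers just the two molecules and the similarity relationship between them.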
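
The second MATCH above is not anchored on the collected molecules, so it considers every isSimilarTo relationship in the graph before filtering. A possible variant, offered only as a sketch, unwinds the collection and starts the similarity match from each collected molecule. It assumes the dataset's direction for substructureOf and isSimilarTo edges created with that type; source and target are illustrative aliases:

```cypher
// Collect the molecules attached to scaffold A
MATCH (:Scaffold {Name: 'A'})-[:substructureOf]->(m:Molecule)
WITH collect(m) AS mols
// Start from each collected molecule and follow its outgoing similarity edges
UNWIND mols AS m1
MATCH (m1)-[sim:isSimilarTo]->(m2:Molecule)
// Keep only edges whose other end is also in the collection
WHERE m2 IN mols
RETURN m1.Name AS source, m2.Name AS target
```

Following the edge in its stored direction also avoids getting each pair back twice.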