I have a neo4j database schema that looks like:
(a:Author)<-[r:HAS_AUTHOR]-(n:Article)-[rel:HAS_DESCRIPTOR]->(d:Descriptor)
I'd like to write a query showing the links between authors and descriptors, filtered for authors that have published more than once (count(r)>1) and for descriptors that occurred in more than one article (count(rel)>1).
Here is the query that I wrote:
MATCH (a:Author)<-[r:HAS_AUTHOR]-(n:Article)-[rel:HAS_DESCRIPTOR]->(d:Descriptor)
WITH a,count(r) as cnt WHERE cnt>1
MATCH (a:Author)<-[r:HAS_AUTHOR]-(n:Article)-[rel:HAS_DESCRIPTOR]->(d:Descriptor)
WITH d,count(rel) as cnt1 WHERE cnt1>1
MATCH (a:Author)<-[r:HAS_AUTHOR]-(n:Article)-[rel:HAS_DESCRIPTOR]->(d:Descriptor)
RETURN * limit 100
It doesn't seem to do what I'm expecting. I'm still seeing Authors or Descriptors linked to a single article.
Note that the count of relationships should be considered only in the context of the query (i.e., with limit 100, all authors should be linked to more than one article in the query output graph).
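For reference, two things in the query above may explain the result. Each WITH keeps only the variables it lists, so after `WITH d, count(rel) as cnt1` the variable `a` is no longer bound and the last MATCH matches every author again. Also, count(r) counts matched rows (one per article and descriptor combination), not distinct articles. Below is a minimal sketch that carries both filters through to the final MATCH. The names authorArticles, descriptorArticles, authors and descriptors are only illustrative, and the counts here are taken over the whole database, so this does not yet satisfy the "within the query output" requirement.

```cypher
// Authors with more than one distinct article (counted over the whole database).
MATCH (a:Author)<-[:HAS_AUTHOR]-(n:Article)
WITH a, count(DISTINCT n) AS authorArticles
WHERE authorArticles > 1
WITH collect(a) AS authors

// Descriptors attached to more than one distinct article (also a global count).
MATCH (d:Descriptor)<-[:HAS_DESCRIPTOR]-(n:Article)
WITH authors, d, count(DISTINCT n) AS descriptorArticles
WHERE descriptorArticles > 1
WITH authors, collect(d) AS descriptors

// Keep only the paths whose author and descriptor both passed their filter.
MATCH (a:Author)<-[:HAS_AUTHOR]-(n:Article)-[:HAS_DESCRIPTOR]->(d:Descriptor)
WHERE a IN authors AND d IN descriptors
RETURN a, n, d
LIMIT 100
```

Because LIMIT 100 truncates rows after the filters are applied, an author or descriptor can still show up with a single article in the returned graph even though it has more in the database.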
Is that the right way to write this query? Thanks
EDIT
I apologize for not being clear enough.
If I run a simple query showing all author--article--descriptor graphs, I can get some of the scenarios shown in the images below.
In all images, yellow nodes are articles, green are authors and pink are descriptors.
Scenario 1: An article that is the only one mentioning the descriptor. I'd like to filter out those descriptors that are mentioned in only one article.
Scenario 2: A descriptor mentioned by more than one article but whose authors have not published any other articles. I'd like to filter out those authors that have published only one article.
These two filters should apply at the sub-graph level. For example: if I filter down to a particular descriptor type, then the two conditions (author and descriptor with more than one article) should be fulfilled in this new sub-graph.
The first query that was proposed generates graphs as in the image below:
MATCH (a:Author)
WHERE size((a)<-[:HAS_AUTHOR]-()) > 1
MATCH (a)<-[:HAS_AUTHOR]-(n:Article)-[:HAS_DESCRIPTOR]->(d:Descriptor)
WITH a, d, collect(n) as articles
WHERE size(articles) > 1
RETURN a, d, articles
Collecting with collect(n) as articles per (a, d) pair forces the author to have published twice on the same descriptor, which is not desirable. I'd like to allow an author who has published papers on 2 different descriptors to appear.
The second query that was proposed generates graphs as in the image below:
MATCH (d:Descriptor)
WHERE size((d)<-[:HAS_DESCRIPTOR]-()) > 1
WITH collect(d) as descriptors
MATCH (a:Author)
WHERE size((a)<-[:HAS_AUTHOR]-()) > 1
MATCH (a)<-[:HAS_AUTHOR]-(n:Article)-[:HAS_DESCRIPTOR]->(d)
WHERE d in descriptors
RETURN a, n, d
Note that I added a filter on descriptor type so that the query could run, and I'm not sure whether that would impact the filtering condition. Here it shows descriptors and authors linked to a single article.
"for descriptors that occurred in more than one article (count(rel)>1)": do you mean across all articles, or for more than one article considering articles per author? - InverseFalcon