
Given a graph like the one shown below, I would like to find the maximum and minimum values within each subgraph and return their difference.

For instance, the right-most subgraph has 4 nodes. Its maximum value is 3 and its minimum value is 1, so the difference to return is 2. This should happen for every disconnected subgraph in the whole graph database. I would prefer to handle each subgraph with one query, so that the work can be done in batches and the difference for each subgraph returned.

I would be thankful for some intuition.


1 Answer


The real problem will be finding those subgraphs: Neo4j has no native support for detecting or tracking disconnected subgraphs, so identifying them requires some intensive full-graph queries.

I've provided an approach to finding disconnected subgraphs and attaching a :Subgraph node to the node with the smallest id in the subgraph in this answer to a similar question.

Once the :Subgraph nodes are in place, you are free to batch queries on the subgraphs.

As noted in that answer, it does not provide a way to keep up with graph changes that affect subgraphs (creating new subgraphs, merging subgraphs, dividing subgraphs).

EDIT

Once you have a :Subgraph node attached to each disconnected subgraph, you can perform operations on subgraphs easily.

You might use this query to calculate the difference:

MATCH (s:Subgraph)-[*]-(subgraphNode)
WITH DISTINCT s, subgraphNode
WITH s, MIN(subgraphNode.value) as minimum, MAX(subgraphNode.value) as maximum
WITH s, maximum - minimum as difference
...

If you need to batch that query, then you'll want to use APOC Procedures, probably apoc.periodic.iterate().
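
As a rough sketch of what that batching could look like (not a tested recipe): the first statement feeds :Subgraph nodes to apoc.periodic.iterate(), and the second runs the difference calculation for each one. Storing the result in a `difference` property on the :Subgraph node and using a batch size of 100 are my assumptions for illustration; substitute whatever you actually need to do with the result.

```cypher
// Sketch only: the `difference` property name and batchSize are assumptions.
CALL apoc.periodic.iterate(
  // Driving statement: emits one row per :Subgraph node.
  "MATCH (s:Subgraph) RETURN s",
  // Action statement: runs for each row; rows are committed in batches.
  "MATCH (s)-[*]-(subgraphNode)
   WITH DISTINCT s, subgraphNode
   WITH s, min(subgraphNode.value) AS minimum, max(subgraphNode.value) AS maximum
   SET s.difference = maximum - minimum",
  {batchSize: 100, parallel: false}
)
YIELD batches, total, errorMessages
RETURN batches, total, errorMessages;
```

Keeping parallel set to false is the cautious choice here. Each batch is committed in its own transaction, so a failure in one batch does not roll back the others; check errorMessages afterwards.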

EDIT

After some testing, it seems that APOC's Path Expander functionality, using NODE_GLOBAL uniqueness, is a more efficient way to find all nodes within a subgraph.

I'll be altering my linked answer accordingly. Here's how this would work with the subgraph query:

MATCH (s:Subgraph)
CALL apoc.path.expandConfig(s,{minLevel:1, bfs:true, uniqueness:"NODE_GLOBAL"}) YIELD path
WITH s, last(nodes(path)) as subgraphNode
WITH s, MIN(subgraphNode.value) as minimum, MAX(subgraphNode.value) as maximum
WITH s, maximum - minimum as difference
...