
When I search for nodes with a certain zipcode:

MATCH (z:ZipCode) WHERE z.zipcode = "2014 AAE" RETURN z.zipcode

I get duplicates:

z.zipcode
2014 AAE
2014 AAE

When I search for relations of a certain zipcode:

MATCH p=(z:ZipCode)-->() WHERE z.zipcode = "2014 AAE" RETURN p

I get a single zipcode node 2014 AAE pointing to a house node 518Q.

How can I merge the zipcode nodes that have the same property value, while leaving all of their relationships intact?

Edit:

After cybersam's answer I constructed a query. Is this the way to combine the nodes with APOC?

MATCH (z1:ZipCode)-->(), (z2:ZipCode)-->()
WHERE z1.zipcode = z2.zipcode
AND ID(z1) <> ID(z2)
WITH COLLECT([z1,z2]) AS zs
CALL apoc.refactor.mergeNodes(zs) YIELD node
RETURN node;

I get this error:

Type mismatch: expected Collection<Node> but was Collection<Collection<Node>> (line 5, column 31 (offset: 160))
"CALL apoc.refactor.mergeNodes(zs) YIELD node"

I don't think it is possible to get 2 results from the first query and 1 from the second. The first query clearly says that there are 2 nodes with that particular zipcode. Is it possible that there is also some node with zipcode 2014 AAE that has an incoming relationship, i.e. (z:ZipCode)<--()? Try rewriting your second query to check. But it has been a long time since I played with neo4j, so I may be wrong. - Gondil

If you want to remove duplicates, you have to match pairs of nodes (e.g. m and n), check that they have the same property value (m.zipcode = n.zipcode) but are not the same node (m <> n), then find all relationships of n that are not on m, create those relationships on m, and finally delete n and all its relationships. A better solution would be to write your create queries so that duplicates are merged while the nodes and relationships are created, so you don't have to do it later. - Gondil

@Gondil: There are 2 zip code nodes, and only one of those is involved in a relationship. So, the results from the 2 queries make perfect sense. - cybersam

1 Answer


[UPDATED]

Aside: You have 2 nodes with the same zip code, but only one of those nodes has a relationship. This explains your results thus far.

In neo4j 3.x, you can install the APOC plugin and use the mergeNodes() procedure, which takes a collection of nodes. It merges the properties and relationships of the 2nd through last nodes onto the first node, and deletes the 2nd through last nodes.

For example:

MATCH (z:ZipCode)
WHERE z.zipcode = "2014 AAE"
WITH COLLECT(z) AS zs
CALL apoc.refactor.mergeNodes(zs) YIELD node
RETURN node;
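
Regarding the query in the edit: COLLECT([z1,z2]) builds a list of two-element lists, which is why the procedure reports Collection<Collection<Node>> instead of the Collection<Node> it expects. In addition, both patterns in that MATCH require an outgoing relationship (-->()), so the duplicate node that has no relationships would never be matched. A minimal sketch that merges every group of duplicates in one pass, assuming the zipcode property alone identifies a duplicate, is to group the nodes by zipcode and hand each group to the procedure:

```cypher
// Group all ZipCode nodes by their zipcode value.
MATCH (z:ZipCode)
WITH z.zipcode AS zipcode, COLLECT(z) AS zs
// Keep only the groups that actually contain duplicates.
WHERE SIZE(zs) > 1
// Merge each group onto its first node; the other nodes are deleted.
CALL apoc.refactor.mergeNodes(zs) YIELD node
RETURN node;
```

Which node of each group survives depends on the order of the collected list, which this query does not control. The sketch is untested against your data, so try it on a copy of the database first.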