I have an OpenStreetMap dataset in Neo4j that I want to restructure. I know how to create unique Street and ZipCode nodes with MERGE:

MERGE(street:Street {street_name: n.`addr:street`})
MERGE(zip:ZipCode {zipcode: n.`addr:postcode`})

But I also want Housenumber nodes. The same house number can occur more than once, but two equal numbers never belong to the same street. I don't think a plain MERGE on the number alone is suitable for this.

So, I want the structure to be something like this:

Street<-number_in_street<-Housenumber
ZipCode<-number_in_zipcode<-Housenumber

Coolstreet<-number_in_street<-20A (Unique Housenumber node 1)
Otherstreet<-number_in_street<-20A (Unique Housenumber node 2)
5680 PC<-number_in_zipcode<-20A (Unique Housenumber node 1)
5680 PC<-number_in_zipcode<-20A (Unique Housenumber node 2)

How can I achieve this structure with Cypher, starting from an OpenStreetMap dataset in Neo4j?

Edit: I don't want to duplicate street names just to get a unique combination with a house number. I want the street and the house number to be separate nodes (to prevent duplication). One unique street node should be connected to all the house numbers in that street.

So I have unlabelled nodes like this:

addr:housenumber:199A
addr:street:Coolstreet
source:BAG
addr:postcode:5414 AP

Each such node needs to be split into Street, Housenumber and ZipCode nodes, while keeping the requested structure.

Do you want a new Housenumber node for every address? Or a set of unique numbers for both streets and zipcodes? And why? Would help to describe your case a bit more. - Martin Preusse
@MartinPreusse To prevent duplication. See my edit for an example of an unlabelled node that needs to be split while keeping the requested structure. - Jeroen Steen
I don't get it. What defines the 'uniqueness' of the housenumber nodes you are talking about? Why not just MERGE housenumber nodes like you MERGE the others and create relationships? - Martin Preusse

1 Answer

MERGE will prevent duplication exactly the way you want. If you want to put the house number and street together on one node so that you can apply a uniqueness constraint, you're misusing the constraint: it's really for optimizing index lookups, and deduplication is just a side effect.

Something like this should work:

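// n is the unlabelled source node, bound by an earlier MATCH
WITH n
MERGE (z:ZipCode {zipcode: n.`addr:postcode`})
MERGE (s:Street {street_name: n.`addr:street`})
// use a new variable h here: n is already bound and cannot be redeclared
MERGE (s)-[:NUMBER_IN_STREET]->(h:HouseNumber {house_number: n.`addr:housenumber`})
MERGE (z)-[:NUMBER_IN_ZIPCODE]->(h)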

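MERGEing the HouseNumber as part of the whole pattern ensures that it is unique for that street: because the Street node is already bound, MERGE only reuses a HouseNumber that is already connected to that street, and creates a new one otherwise. The same number in a different street therefore gets its own node. You can also put a regular (non-unique) index on the house_number property to speed things up a bit.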
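
For completeness, here is a sketch of how the whole thing might look end to end. It is untested against your data and makes a few assumptions: the source nodes are matched with a bare MATCH (n) filtered on the three addr:* properties, nodes with a missing property are skipped (MERGE raises an error on null property values), and the index name house_number_idx is just a placeholder. The index statement uses the Neo4j 4.x+ syntax; on older releases the equivalent is CREATE INDEX ON :HouseNumber(house_number). Run the three statements separately, since schema and data changes cannot share a transaction.

```cypher
// 1. Optional: plain (non-unique) index on the house number.
//    house_number_idx is a placeholder name.
CREATE INDEX house_number_idx
FOR (h:HouseNumber) ON (h.house_number);

// 2. Split every source node that carries a complete address.
MATCH (n)
WHERE n.`addr:street` IS NOT NULL
  AND n.`addr:postcode` IS NOT NULL
  AND n.`addr:housenumber` IS NOT NULL
MERGE (s:Street {street_name: n.`addr:street`})
MERGE (z:ZipCode {zipcode: n.`addr:postcode`})
// s is bound, so this reuses only a HouseNumber already attached to this street
MERGE (s)-[:NUMBER_IN_STREET]->(h:HouseNumber {house_number: n.`addr:housenumber`})
MERGE (z)-[:NUMBER_IN_ZIPCODE]->(h);

// 3. Sanity check: list each street that has a 20A and how many
//    HouseNumber nodes with that number are attached to it.
MATCH (s:Street)-[:NUMBER_IN_STREET]->(h:HouseNumber {house_number: '20A'})
RETURN s.street_name AS street, count(h) AS house_number_nodes;
```

If the check returns one row per street with a count of 1, then Coolstreet and Otherstreet each have their own 20A node, while both nodes can still point to the same ZipCode.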