I have a large graph in which some nodes represent people. All of them have firstname and surname properties; some also have a middlename property. I'm looking for nodes that might represent the same person, so I'm comparing the different permutations of names. I currently compare surnames and the first initial of firstname (some nodes only have initials), but I can't figure out how to test middlename only when it exists.
My current query is:
    MATCH (a:Author), (b:Author)
    WHERE a.surname = b.surname
      AND a.firstname STARTS WITH 'A'
      AND b.firstname STARTS WITH 'A'
    RETURN DISTINCT a, b
My understanding is that OPTIONAL MATCH applies only to patterns, not to properties, so it won't help here. I also can't find a way to write an if statement that makes sense.
It may make more sense to do this programmatically rather than relying on Cypher alone, but I was hoping to keep it really simple and do it all in Cypher.
Some examples to clarify what I want to do.
Example 1:
Node 1: firstname "John" middlename "Patrick" surname "Smith"
Node 2: firstname "J" middlename "P" surname "Smith"
Node 3: firstname "J" middlename "Q" surname "Smith"
Node 4: firstname "J" surname "Smith"
I want a query that will return nodes 1, 2, and 4 as 'matching'.
Example 2:
Node 1: firstname "Jane" surname "Smith"
Node 2: firstname "J" middlename "P" surname "Smith"
Node 3: firstname "J" middlename "Q" surname "Smith"
Node 4: firstname "J" surname "Smith"
Here, I want all 4 nodes, since the 'canonical' name doesn't have a middle name.
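To make the rule in these examples concrete, here is a rough sketch of the pairwise check I think I need. It assumes that a missing middlename property evaluates to null in Cypher, and that comparing initials with left() is good enough; the id(a) < id(b) condition is only my assumption for skipping self-pairs and mirrored duplicates (newer Neo4j versions may prefer elementId()). I have not verified this against my data.

```cypher
MATCH (a:Author), (b:Author)
WHERE id(a) < id(b)                                   // skip (a,a) and mirrored (b,a) pairs
  AND a.surname = b.surname
  AND left(a.firstname, 1) = left(b.firstname, 1)     // same first initial
  AND (a.middlename IS NULL                           // no middle name on either side
       OR b.middlename IS NULL                        // means nothing to contradict
       OR left(a.middlename, 1) = left(b.middlename, 1))
RETURN a, b
```

Note that this compares nodes pair by pair rather than against one 'canonical' node, so in Example 1 it would also pair node 3 with node 4, because node 4 has no middle name to contradict the "Q".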