
I'm very new to Neo4j. I have a schema, shown in a diagram that is not reproduced here.

The nodes reached through HAS relationships are of different kinds, for example Passport, Merchant, Driving License, etc. These nodes describe the Customer node (I expect to filter customers based on them in the future).

SIMILAR is a self-relation between customers: for example, the customer with ID:1 is related to the customer with ID:2 with a score of 2800.

I have the following questions:

  1. Is this a good schema given the future filtering I mentioned above, or is it viable to put all the properties in a single Customer node? (Different nodes may hold arrays of items as well, for example: ()-[:HAS]->(Phone) having {active: "+91-1231241", historic_phone_numbers: ["+91-121213", "+91-1231421"]})
  2. I want to get each customer, along with its describing nodes, in relation to other customers. For that, I tried the queries below (with and without requiring more than one relation):

    // With number_of_relation > 1
    MATCH (searched:Customer)-[r:SIMILAR]->(matched:Customer)
    WHERE r.score > 2700
    WITH searched, COLLECT(matched.customer_id) AS MatchedList, count(r) as cnt
    WHERE cnt > 1
    UNWIND MatchedList AS matchedCustomer
    MATCH (person:Customer {customer_id: matchedCustomer})-[:HAS|:LIVES_IN|:IS_EMPLOYED_BY]->(related)
    RETURN searched, person, related

In the result I got (screenshot not reproduced here), one customer node is missing its describing nodes.

    // without number_of_relation > 1
    // second attempt - for a sample customer_id
    MATCH (matched)<-[r:SIMILAR]-(c)-[:HAS|:LIVES_IN|:IS_EMPLOYED_BY]->(b)
    WHERE size(keys(b)) > 0
    AND c.customer_id = "1b093559-a39b-4f95-889b-a215cac698dc"
    AND r.score > 2700
    RETURN b AS props, c AS src_cust, r AS relation, matched

In the result I got (screenshot not reproduced here), the related customer nodes do not have their describing nodes.

  3. If I had two describing nodes with some property (some may hold a list) that I wanted to query on, how could I build the expected graph described in point 2 above?

  4. I want the database to find similar customers given the describing nodes. Example: a customer {name: "Dave"} who has phone {active_number: "+91-12345"} is similar to a customer {name: "Mike"} who has phone {active_number: "+91-12345"}. How can I get started with this?

If something is unclear, please ask. I can explain with examples.

1 Answer

[EDITED]

  1. Yes, the schema seems fine, except that you should not use the same HAS relationship type between different node label pairs.
  2. The main problem with your first query is that its first MATCH clause uses a directional relationship pattern, ()-->(), so not every Customer node gets a chance to be the searched node (a customer that appears only at the tail end of SIMILAR relationships is never matched as searched). This tweaked query, which ignores the direction of SIMILAR, should work better:

    MATCH (searched:Customer)-[r:SIMILAR]-(matched:Customer)
    WHERE r.score > 2700
    WITH searched, COLLECT(matched) AS matchedList
    WHERE SIZE(matchedList) > 1
    UNWIND matchedList AS person
    MATCH (person)-[:HAS|LIVES_IN|IS_EMPLOYED_BY]->(pDesc)
    WITH searched, person, COLLECT(pDesc) AS personDescribers
    MATCH (searched)-[:HAS|LIVES_IN|IS_EMPLOYED_BY]->(sDesc)
    RETURN searched, person, personDescribers, COLLECT(sDesc) AS searchedDescribers
    
  3. It's not clear what you are trying to do.

  4. To get all Customers who have the same phone number:

    MATCH (c:Customer)-[:HAS_PHONE]-(p:Phone)
    WHERE p.activeNumber = '+91-12345'
    WITH p.activeNumber AS phoneNumber, COLLECT(c) AS customers
    WHERE SIZE(customers) > 1
    RETURN phoneNumber, customers
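
If you do not want to hard-code a phone number, a pairwise variant of the query in point 4 might look like the sketch below. It assumes the same hypothetical model as above, (:Customer)-[:HAS_PHONE]->(:Phone) with an activeNumber property, and a customer_id property on Customer as in your queries; adjust the names to your actual schema.

```cypher
// Sketch only: HAS_PHONE and activeNumber are assumed names, not confirmed schema.
// Find every pair of distinct customers whose phones share the same active number.
MATCH (a:Customer)-[:HAS_PHONE]->(pa:Phone),
      (b:Customer)-[:HAS_PHONE]->(pb:Phone)
WHERE a.customer_id < b.customer_id          // report each pair once, skip self-pairs
  AND pa.activeNumber = pb.activeNumber      // same number, whether or not the Phone node is shared
RETURN a, b, pa.activeNumber AS sharedNumber
```

Comparing property values rather than requiring a shared Phone node means this works whether each customer has its own Phone node or several customers point at one. On a large graph, an index on the phone-number property would probably be needed to keep this comparison affordable.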