How to avoid cartesian-product in a cypher query and still create links between objects?

Question

I imported a table with thousands of Equipment records, then imported another table of equipment types, which contains around 20 types.

When I wrote the cypher query below to associate them, Neo4j warned me about a cartesian product. Is there a better way to create the associations? Should I have done it during the CSV import?

MATCH (te:Equipment_Type),(e:Equipment)
WHERE te.type_id = e.type_id
CREATE (e)-[:TYPE_OF]->(te)

Update

I tried what Brian suggested, creating the relationships during the CSV import, and it worked like a charm.

Imported the Equipment Types first;
Then created an index on Equipment(type_id);
Modified the code to look up the type during the CSV import.

From Neo4j Console:

Added 100812 labels, created 100812 nodes, set 414307 properties, created 100812 relationships, statement executed in 33902 ms.

The Code:

CREATE INDEX ON :Equipment(type_id)

USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "http://localhost/Equipments.csv" AS row
MERGE (e:Equipment {eqp_id: row.eqp_id, name: row.name, type_id: row.type_id})
WITH e, row
MATCH (te:Equipment_Type)
WHERE te.type_id = row.type_id
CREATE (e)-[:TYPE_OF]->(te)

Brian Underwood · Accepted Answer · 2015-11-09T13:36:15

With the size of data that you're talking about it's not a big deal, especially if you have indexes on :Equipment_Type(type_id) and :Equipment(type_id). Neo4j warns you because a query containing a cartesian product can seem quick when you first write it against a small dataset, and then slow down sharply as the data grows.

But yes, creating the relationships during the CSV import would probably be the best way to approach it.
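If you do keep the linking as a separate step after the import, here is a minimal sketch of how it could look. It reuses the labels and the type_id property from the question and the same legacy CREATE INDEX ON syntax used in the update. It assumes both indexes do not exist yet, and that each statement is run on its own, since schema changes and data writes cannot share a transaction. Putting the lookup in a second MATCH keyed on e.type_id should let the planner do an index seek per Equipment node instead of pairing every node of one label with every node of the other, but it is worth checking the plan with EXPLAIN on your own version.

```cypher
// Legacy index syntax, matching the statement used in the question's update.
CREATE INDEX ON :Equipment_Type(type_id);
CREATE INDEX ON :Equipment(type_id);

// Start from each Equipment node and look up its type by type_id,
// rather than matching both labels as two disconnected patterns.
MATCH (e:Equipment)
MATCH (te:Equipment_Type {type_id: e.type_id})
CREATE (e)-[:TYPE_OF]->(te);
```

Because this uses CREATE, running it twice would produce duplicate TYPE_OF relationships; swapping in MERGE for the last clause is one way to make it safe to re-run.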
