I imported a table with thousands of Equipment records, then imported another table of equipment types, which contains around 20 types.

When I wrote the Cypher query below to associate them, Neo4j warned me about a cartesian product. Is there a better way to create the associations? Should I have done it during the CSV import?

MATCH (te:Equipment_Type),(e:Equipment)
WHERE te.type_id = e.type_id
CREATE (e)-[:TYPE_OF]->(te)

Update

I tried what Brian suggested, creating the relationships during the CSV import, and it worked like a charm.

  1. Imported the Equipment Types first;
  2. Then created an index on Equipment(type_id);
  3. Modified the code to search during CSV import.

From Neo4j Console:

Added 100812 labels, created 100812 nodes, set 414307 properties, created 100812 relationships, statement executed in 33902 ms.

The Code:

CREATE INDEX ON :Equipment(type_id)

USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "http://localhost/Equipments.csv" AS row
MERGE (e:Equipment {eqp_id: row.eqp_id, name: row.name, type_id: row.type_id})
WITH e, row
MATCH (te:Equipment_Type)
WHERE te.type_id = row.type_id
CREATE (e)-[:TYPE_OF]->(te)

1 Answer

With the size of data that you're talking about it's not a big deal, especially if you have indexes on Equipment_Type(type_id) and Equipment(type_id). Neo4j warns you because a cartesian product in a query can seem quick when you first write it against a small dataset, and then its cost grows quickly as you get more data.
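A rough sketch of what that could look like, using the same legacy index syntax as in the question (newer Neo4j versions use a CREATE INDEX FOR ... ON ... form instead, so adjust to your version). The second part is only one possible rewrite of your original query, not something I have run against your data: the equipment type is looked up by property in its own MATCH, which should let the planner use the index rather than pair up every node from both labels.

```cypher
// Index the lookup key on both labels (legacy syntax, as in the question).
CREATE INDEX ON :Equipment_Type(type_id);
CREATE INDEX ON :Equipment(type_id);

// Assumed alternative to the two-pattern MATCH: for each equipment node,
// find its type by type_id, then create the relationship.
MATCH (e:Equipment)
MATCH (te:Equipment_Type {type_id: e.type_id})
CREATE (e)-[:TYPE_OF]->(te);
```

Prefixing the last query with EXPLAIN shows the plan without executing it, so you can check whether an index seek is actually used.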

But yes, creating the relationships during the CSV import would probably be the best way to approach it.