I am converting a relational database to Neo4j using py2neo's CSV batch import, as described here. I organized the code like this:

import csv

from py2neo import Graph

delivery = "Delivery.csv"
graph = Graph("http://localhost:7474/db/data/")
graph.cypher.execute("CREATE CONSTRAINT ON (delivery:Delivery) ASSERT delivery.name IS UNIQUE")
with open(delivery, 'r+') as in_file:

    reader = csv.reader(in_file, delimiter=';')
    next(reader, None)
    batch = graph.cypher.begin()

    try:
        i = 0;
        j = 0;
        for row in reader:
            if row:
                name = strip(row[0])
                created_by = strip(row[1])
                created_on = strip(row[2])
                description = strip(row[3])
                delivered_to = strip(row[4])
                delivered_by = strip(row[5])
                barcode = strip(row[6])

                query = """
                    merge (delivery:Delivery {name:{a}})
                    merge (email:Email {email:{b}})
                    merge (created_on:Created_On {created_on:{c}})
                    merge (description:Description {describe:{d}})
                    merge (email:Email {email:{e}})
                    merge (email:Email {email:{f}})
                    merge (barcode:Barcode {code:{g}})
                  ##and there are some relationships##
                   """
                batch.append(query, {"a": name, "b": created_by, "c": created_on, "d": description,
                                     "e": delivered_to, "f": delivered_by, "g": barcode})
                  .........................................................
                  .........code goes as described in the hyperlink.........

The "email" nodes were created earlier from another file, "people.csv", and there is a uniqueness constraint on them. In the lines above, I want to merge "created_by", "delivered_to" and "delivered_by" with those existing "email" nodes and connect them through relationships. When I run the code, it fails with the error "email already declared" and does not create any nodes from the CSV. How should the Cypher query be organized to prevent this error? Thanks.

1 Answer

You have three nodes in the query that you are trying to call email. You need to rename two of them so that no name is declared more than once.

query = """
     merge (delivery:Delivery {name:{a}})
     merge (email:Email {email:{b}})
     merge (created_on:Created_On {created_on:{c}})
     merge (description:Description {describe:{d}})
     merge (email:Email {email:{e}})                // <- pick a different name
     merge (email:Email {email:{f}})                // <- pick a different name
     merge (barcode:Barcode {code:{g}})
     """

When MATCHing or MERGEing, if you provide a name before the : (like email and others in your example above), you are binding the matched or merged node to that name. Within one query, a name that is already bound cannot be declared again with a label or properties, which is what the "already declared" error is telling you.
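If you do need the three Email nodes later for relationships, something along these lines might work. This is only a sketch: the names creator, recipient and courier and the relationship types CREATED_BY, DELIVERED_TO and DELIVERED_BY are my own placeholders, since the question does not show the real relationships. The small helper is a rough regex check, not a Cypher parser; it just lists names that are declared with a label more than once.

```python
import re
from collections import Counter

# Each Email node gets its own name so it can be reused in the relationship
# clauses. The names and relationship types here are placeholders (assumed).
QUERY = """
    MERGE (delivery:Delivery {name: {a}})
    MERGE (creator:Email {email: {b}})
    MERGE (recipient:Email {email: {e}})
    MERGE (courier:Email {email: {f}})
    MERGE (delivery)-[:CREATED_BY]->(creator)
    MERGE (delivery)-[:DELIVERED_TO]->(recipient)
    MERGE (delivery)-[:DELIVERED_BY]->(courier)
"""


def redeclared_variables(query):
    """Return names that are declared with a label more than once."""
    # Matches "(name:" but not a bare reference such as "(delivery)".
    names = re.findall(r"\(\s*(\w+)\s*:", query)
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


# The pattern from the question trips the check; the renamed query does not.
BROKEN = "merge (email:Email {email:{b}}) merge (email:Email {email:{e}})"
print(redeclared_variables(BROKEN))
print(redeclared_variables(QUERY))
```

If two of the addresses happen to be the same person, the separate MERGE clauses should simply bind two names to the same existing node.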

If you are not going to use these names later in your query, I suggest not naming them. For example:

merge (:Email {email:{b}})

will merge your Email node without holding a reference to it. From what I can tell from your code, I think you could just exclude the names.
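As a rough illustration of the unnamed version, here is the whole query with every node left anonymous, plus one way the parameters could be built from a CSV row. The helper row_to_params and the sample row are made up for this example, and the field order is assumed to match the column order in the question.

```python
import csv
import io

# No names are bound, so repeating the :Email pattern three times is fine.
QUERY_ANON = """
    MERGE (:Delivery {name: {a}})
    MERGE (:Email {email: {b}})
    MERGE (:Created_On {created_on: {c}})
    MERGE (:Description {describe: {d}})
    MERGE (:Email {email: {e}})
    MERGE (:Email {email: {f}})
    MERGE (:Barcode {code: {g}})
"""


def row_to_params(row):
    """Map one 7-field CSV row to the query parameters (hypothetical helper)."""
    (name, created_by, created_on, description,
     delivered_to, delivered_by, barcode) = (field.strip() for field in row)
    return {"a": name, "b": created_by, "c": created_on, "d": description,
            "e": delivered_to, "f": delivered_by, "g": barcode}


# Made-up sample row, only to show the mapping.
sample = io.StringIO(
    "D-1; ann@example.com ;2015-01-01;box;bob@example.com;cy@example.com;123\n"
)
params = row_to_params(next(csv.reader(sample, delimiter=";")))
print(params)
```

Inside your loop, the call would then look like batch.append(QUERY_ANON, row_to_params(row)). Keep in mind that anonymous nodes cannot be referenced in later relationship clauses, so this only fits if the query does not need them again.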