Say I have a fairly large JSON object. This object can be nested in many ways and include arrays. The object gives me a user and their relationship to singular objects of a type and to multiple (array) objects of a type.

My goal is to insert this into Neo4j as fast and efficiently as possible.

Can this be done, and is it advised to do, in one query that is a concatenated string created by iterating the JSON structure? (Parsing the object and abstracting into multiple nodes and relationships)

Also, a common issue is making this process difficult for me: for the arrays that I'm iterating, when I concatenate MERGE statements, nodes in Neo4j get duplicated, and the merge doesn't seem to work.

//Person to interests
_.each(interests, function(itr){
    ingestQuery += 'MERGE(centerRep)-[:INTERESTED_IN]->(:Interest{name: "'+itr.interest_name+'", category: "'+itr.interest_category+'"})'
},this)

So if this statement is run twice, the Interest node is created twice, which is unwanted behavior.

"Can this be done, and is it advised to do, in one query that is a concatenated string created by iterating the JSON structure?" - Not sure what you're asking: You want to store a large string in Neo4j? You want to parse a JSON-formatted string into multiple nodes? You're trying to build an importer? Please edit your question to clarify. - David Makogon
Also: You have an additional question (regarding merging), which you should consider asking separately. - David Makogon
Edited, and if it's simple enough to answer here, please do so. To me it is part of my problem (trying to do everything with one query vs. separating it out into multiple). - rambossa

1 Answer


You can pass the JSON as a parameter to a Cypher query and use UNWIND to iterate through arrays in the JSON.

For example (taken from this blog post), let's say your JSON looks like this:

{ "items": [{
"question_id": 24620768,
"link": "http://stackoverflow.com/questions/24620768/neo4j-cypher-query-get-last-n-elements",
"title": "Neo4j cypher query: get last N elements",
"answer_count": 1,
"score": 1,
.....
"creation_date": 1404771217,
"body_markdown": "I have a graph....How can I do that?",
"tags": ["neo4j", "cypher"],
"owner": {
    "reputation": 815,
    "user_id": 1212067,
    ....
    "link": "http://stackoverflow.com/users/1212067/"
},
"answers": [{
    "owner": {
        "reputation": 488,
        "user_id": 737080,
        "display_name": "Chris Leishman",
        ....
    },
    "answer_id": 24620959,
    "share_link": "http://stackoverflow.com/a/24620959",
    ....
    "body_markdown": "The simplest would be to use an ... some discussion on this here:...",
    "title": "Neo4j cypher query: get last N elements"
}]
}]
}

Passing this JSON object as a parameter to a Cypher query to insert it into the graph would look like this:

// $json is the current parameter syntax; older Neo4j versions wrote {json}, which was removed in 4.0
WITH $json AS data
UNWIND data.items AS q
MERGE (question:Question {id: q.question_id})
  ON CREATE SET question.title = q.title, question.share_link = q.share_link, question.favorite_count = q.favorite_count
MERGE (owner:User {id: q.owner.user_id})
  ON CREATE SET owner.display_name = q.owner.display_name
MERGE (owner)-[:ASKED]->(question)
FOREACH (tagName IN q.tags | MERGE (tag:Tag {name: tagName}) MERGE (question)-[:TAGGED]->(tag))
FOREACH (a IN q.answers |
  MERGE (question)<-[:ANSWERS]-(answer:Answer {id: a.answer_id})
  MERGE (answerer:User {id: a.owner.user_id})
    ON CREATE SET answerer.display_name = a.owner.display_name
  MERGE (answer)<-[:PROVIDED]-(answerer))

There are a few more examples of this here and here.
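
From application code, sending the parsed JSON as a single parameter might look roughly like the sketch below. It assumes `session` is a session from the official JavaScript driver (anything exposing `run(query, params)` works the same way), uses the `$json` parameter syntax of current Neo4j versions, and shortens the query to the question/owner part. The names `INGEST_QUERY` and `ingestQuestions` are made up for illustration.

```javascript
// Shortened version of the ingest query; the parameter is referenced as $json.
const INGEST_QUERY = `
WITH $json AS data
UNWIND data.items AS q
MERGE (question:Question {id: q.question_id})
  ON CREATE SET question.title = q.title
MERGE (owner:User {id: q.owner.user_id})
MERGE (owner)-[:ASKED]->(question)
`;

// Hypothetical helper: `session` is assumed to expose run(query, params),
// as a neo4j-driver session does.
function ingestQuestions(session, data) {
  // The whole object travels as one parameter named `json`;
  // no values are concatenated into the query text.
  return session.run(INGEST_QUERY, { json: data });
}
```

Because the query text never changes between calls, only the parameter does, so the same string can be reused for every JSON document you ingest.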

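Regarding your MERGE issue: MERGE treats the whole pattern as a unit and will "get or create" that entire pattern. If the full pattern (here, the relationship together with the Interest node) is not found, every unbound part of it is created, even if an identical Interest node already exists elsewhere in the graph. What you usually want is to MERGE each node on its own, on an identifying property, so that the node is not duplicated, and then MERGE the relationship between the already-bound nodes. Look at the MERGE section of this blog post for more detail.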
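
Applied to the interests loop from the question, that advice could be sketched as follows. This is an illustration under assumptions, not a drop-in fix: it assumes the person is a `:Person` node keyed by an `id` property and that `name` alone identifies an interest, neither of which is stated in the question. `buildInterestQuery` is a hypothetical helper name.

```javascript
// Builds one parameterized query instead of concatenating a MERGE per interest.
// Assumptions (not from the question): :Person nodes are keyed by `id`,
// and an interest is uniquely identified by its name.
function buildInterestQuery(personId, interests) {
  const query = [
    'MATCH (centerRep:Person {id: $personId})',
    'UNWIND $interests AS itr',
    // MERGE the node by itself so an existing Interest is matched, not recreated
    'MERGE (i:Interest {name: itr.name})',
    '  ON CREATE SET i.category = itr.category',
    // Both ends are bound now, so only a missing relationship is created
    'MERGE (centerRep)-[:INTERESTED_IN]->(i)',
  ].join('\n');

  const params = {
    personId,
    interests: interests.map((itr) => ({
      name: itr.interest_name,
      category: itr.interest_category,
    })),
  };
  return { query, params };
}
```

The result can be handed to the driver as `session.run(query, params)`. Running it twice with the same input should leave the graph unchanged the second time, since each MERGE finds what the first run created.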