I want to add a unique ID property to all nodes in my database, but I need to apply it to existing data. I'm using Ruby to generate the IDs and then running the Cypher query from there. I want to avoid one query to find the nodes missing the property followed by another to set the property on each node individually, since that would require total_nodes + 1 queries.
Initially, I was thinking I could do something like this:
MATCH (n:`#{label}`) WHERE NOT HAS(n.my_id) SET n.my_id = '#{gen_method}' RETURN DISTINCT(true)
Of course, this wouldn't work because it would call gen_method once in Ruby, and then Neo4j would try to set every node's ID to that one value.
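To illustrate the problem, here is a minimal Ruby sketch. The lambda standing in for gen_method, the use of SecureRandom.uuid, and the 'Person' label are all assumptions for the example, not my real code. It only builds the query string and counts how often the generator runs:

```ruby
require 'securerandom'

calls = 0

# Hypothetical stand-in for gen_method; it counts how many times it runs.
gen_method = lambda do
  calls += 1
  SecureRandom.uuid
end

label = 'Person' # hypothetical label, for illustration only

# Interpolation happens once, in Ruby, while the string is being built.
query = "MATCH (n:`#{label}`) WHERE NOT HAS(n.my_id) " \
        "SET n.my_id = '#{gen_method.call}' RETURN DISTINCT(true)"

puts query
puts "generator calls: #{calls}"
```

The generator runs exactly once, so the query that reaches Neo4j contains a single literal ID that would be written to every matched node.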
I'm thinking now that it might be best to generate a large number of IDs in Ruby first, then include them in the Cypher query. I'd like to loop through the matched nodes and set the missing property on each one to the value at the corresponding index in the array. The logic should go something like this:
MATCH NODES WHERE GIVEN PROPERTY IS NULL, LIMIT TO 10,000
CREATE A COLLECTION OF THOSE NODES
SET NEW UUIDS ARRAY (provided by Ruby) AS "IDS_ARRAY"
FOR EACH NODE IN COLLECTION
SET GIVEN PROPERTY VALUE = CORRESPONDING INDEX POSITION IN "IDS_ARRAY"
RETURN COUNT OF NODES WHERE GIVEN PROPERTY IS NULL
Based on the return value, it would know how many more times to do this. Cypher has a FOREACH loop, but how do I do this, especially if my unique_ids array starts out as a string in the Cypher query?
unique_ids = ['first', 'second', 'third', 'etc']
i = 0
for node in matched_nodes
  node.my_id_property = unique_ids[i]
  i += 1
end
Is it even possible? Is there a different way of handling this that will work?
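For reference, the batching logic described above can be sketched in plain Ruby. This is only a simulation under stated assumptions: the nodes are hashes in memory rather than Neo4j nodes, assign_batch is a hypothetical stand-in for the single Cypher query I'm asking about, and SecureRandom.uuid is assumed as the ID generator.

```ruby
require 'securerandom'

BATCH_SIZE = 10_000

# Hypothetical stand-in for the one Cypher query per batch.
# Pairs each node that lacks :my_id with the ID at the same index,
# then returns how many nodes are still missing the property.
def assign_batch(nodes, ids)
  missing = nodes.select { |n| n[:my_id].nil? }.first(ids.size)
  missing.each_with_index { |node, i| node[:my_id] = ids[i] }
  nodes.count { |n| n[:my_id].nil? }
end

# Repeats the batch until nothing is left; returns the number of "queries".
def backfill_ids(nodes, batch_size = BATCH_SIZE)
  queries = 0
  loop do
    ids = Array.new(batch_size) { SecureRandom.uuid }
    remaining = assign_batch(nodes, ids)
    queries += 1
    break if remaining.zero?
  end
  queries
end

# Simulated data: 25 nodes without an ID, processed 10 at a time.
nodes = Array.new(25) { { my_id: nil } }
query_count = backfill_ids(nodes, 10)
puts "batches run: #{query_count}"
```

The point of this shape is that the number of round trips grows with total_nodes / batch_size instead of total_nodes + 1. What I don't know is how to express the body of assign_batch in Cypher.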