How do you quickly get the maximum (or minimum) value of a property across all instances of a relationship? You can assume the machine I'm running this on is well within the recommended CPU and memory specs for the size of the graph, and the heap size is set accordingly.

Facts:

  • Using Neo4j v2.2.3
  • I can only modify the graph via the Cypher query language, which I'm hitting via PHP or the web interface. I would love to avoid any solution that requires Java coding.
  • I've got a relationship, call it likes that has a single property id that is an integer.
  • There's about 100 million of these relationships and growing
  • Every day I grab new likes from a MySQL table to add to the graph in Neo4j
  • The relationship property id is actually the primary key (auto incrementing integer) from the raw MySQL table.
  • I only want to add new likes so before querying MySQL for the new entries I want to get the max id from the likes, so I can use it in my SQL query as SELECT * FROM likes_table WHERE id > max_neo4j_like_property_id

How can I get the max id property from Neo4j in an optimal way? Please include the create statement needed for any index as well as the query you'd use to get the final result.

I've tried creating an index as follows:

 CREATE INDEX ON :likes(id);

After the index is online I've tried:

 MATCH ()-[r:likes]-() RETURN r.id ORDER BY r.id DESC LIMIT 1

as well as:

 MATCH ()-[r:likes]->() RETURN MAX(r.id) 

They work but take freaking forever, as the explain plan for both indicates no indexes are being used.

UPDATE: Holy $?@#$?!!!! It looks like the new schema indexes aren't functional for relationships, even though you can create them and show them with :schema. It also looks as if there's no way to create Legacy Indexes directly with Cypher, and those look like they might solve this issue.

I would probably just store the max id you inserted last time somewhere in Neo4j or MySQL and update it after your batch of inserts is done. – Michael Hunger
Schema indexes were never meant for relationships; what you did was create a node-label index for the label :likes. – Michael Hunger
@MichaelHunger I see. I didn't expect Neo4j to let me create an index on a non-existent node label, but your explanation makes sense. I'm pretty sure I'll have to go the route of creating an additional way to track this, maybe just a tracking node with a single value IO update. I was hoping it would be possible intrinsically. Also, sometimes I just want to grab a specific relationship by id, and that doesn't seem to work either. – Ray
@MichaelHunger Do you know if there is a plan to allow indexing on relationships, via schema indexes or otherwise? – Ray
I guess I could make each likes relationship a node. Then, instead of a likes-type relationship directly between two of the people nodes, I could use a wonky workaround: (n:person {id: 'bob'} )-->(:likes {id: 3})-->(:person {id: 'jim'} ) – Ray
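
The tracking-node idea from the comments above could look roughly like the following. This is only a sketch: the :SyncState label, its name and maxId properties, and the {newMaxId} parameter are hypothetical names chosen for illustration, and each statement is meant to be run on its own.

```cypher
// One-time setup: a single bookkeeping node.
// (:SyncState, name and maxId are hypothetical names, not from the question.)
MERGE (s:SyncState {name: 'likes'})
ON CREATE SET s.maxId = 0;

// After each daily import: record the highest MySQL id that was just loaded.
// {newMaxId} is a query parameter supplied by the PHP client.
MATCH (s:SyncState {name: 'likes'})
SET s.maxId = {newMaxId};

// Before the next import: read the value back for the SQL WHERE clause.
MATCH (s:SyncState {name: 'likes'})
RETURN s.maxId;
```

Because only one :SyncState node exists, the read should not depend on how many likes relationships are in the graph. For a graph that already holds likes, maxId would have to be seeded once, for example from the slow MAX(r.id) query in the question.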

1 Answer

If you need to query relationship properties, it is generally a sign of a model issue.

The need for this query suggests that you would be better off extracting these properties into a node, which you can then query faster.
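
As a rough sketch of what extracting the property into a node might look like: the :person and :likes labels and the id values come from the question and its comments, while the GAVE and TO relationship types are hypothetical. Each statement is meant to be run on its own.

```cypher
// With :likes as a node label, this schema index now covers real nodes.
CREATE INDEX ON :likes(id);

// "bob likes jim" modelled through an intermediate :likes node.
// GAVE and TO are hypothetical relationship types.
MATCH (a:person {id: 'bob'}), (b:person {id: 'jim'})
CREATE (a)-[:GAVE]->(:likes {id: 3})-[:TO]->(b);

// An exact lookup by id is the kind of query the index is meant for.
MATCH (l:likes {id: 3})
RETURN l;

// The aggregate itself, now over nodes instead of relationships.
MATCH (l:likes)
RETURN max(l.id);
```

One caveat, stated as an assumption rather than a verified fact: the index clearly helps the exact lookup, but whether the planner in 2.2.3 also uses it for max() is worth checking with EXPLAIN, since the aggregate may still scan every :likes node.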

I'm not saying this is true 100% of the time, but 99% of the people I've seen with this problem turned out to have exactly this modeling issue.

What is your model right now?

Also, you don't use any node labels in your query; a likes relationship only has meaning in the context of the nodes it connects.
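
For illustration, the query from the question with node labels added might read as follows, assuming the :person label mentioned in the asker's comment applies to both ends of the relationship.

```cypher
// Same aggregate as in the question, with the end nodes labelled.
// :person on both ends is an assumption based on the asker's comment.
MATCH (:person)-[r:likes]->(:person)
RETURN max(r.id);
```

This ties the relationship to its node context, but it still has to read every likes relationship, so on its own it is unlikely to fix the runtime.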