I am trying to build a graph of the different entities that people have liked on Facebook, in order to create a basic cross-domain recommendation engine.
I have data for several entity types (movies, books, music, etc.). Each item is a node with two properties: the name of the item (the name of the movie, book, etc.) and the entity type of the item (movie, book, etc.). Item nodes are connected to each other by relationships called "affinity". This relationship also has a "strength" property, which is equal to the number of people who have liked both items.
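To make the "strength" definition concrete, here is a minimal Python sketch of it, assuming the likes are available as a mapping from person to the set of items they liked. The function name affinity_strengths and the sample data are made up for illustration; this is not my actual import code.

```python
from collections import Counter
from itertools import combinations


def affinity_strengths(likes):
    """Count, for every pair of items, how many people liked both.

    `likes` maps a person name to the set of items that person liked.
    Returns a Counter keyed by (item_a, item_b) with item_a < item_b.
    """
    strengths = Counter()
    for items in likes.values():
        # Sort so each unordered pair is always counted under the same key.
        for a, b in combinations(sorted(items), 2):
            strengths[(a, b)] += 1
    return strengths


# Hypothetical sample data, for illustration only.
sample_likes = {
    "alice": {"Inception", "Dune", "Kind of Blue"},
    "bob": {"Inception", "Dune"},
    "carol": {"Dune", "Kind of Blue"},
}

strengths = affinity_strengths(sample_likes)
print(strengths[("Dune", "Inception")])     # alice and bob liked both
print(strengths[("Dune", "Kind of Blue")])  # alice and carol liked both
```

Each pair with a non-zero count corresponds to one "affinity" relationship, with that count as its "strength".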
I use FB users to connect these nodes. FB users are also nodes in the graph, with the name of the person as one property and a type of person. The relationship between these nodes and the item nodes is called 'likes'. Now, if a person has liked a movie, I would like to recommend books or music to them by traversing the graph. This is the Cypher query I am using to traverse the graph:
START root = node(<LIKED_MOVIE_NODE_ID>)
MATCH p = root-[rel1:affinity*..3]-(movies)<-[rel2:likes]-(persons)-[rel3:likes]->(books)
WHERE HAS(movies.type) and movies.type = "movies" and HAS(persons.type) and persons.type = "person" and HAS(books.type) and books.type = "books"
RETURN books
This runs very slowly, sometimes taking up to 500 seconds. I have about 13,000 movie, 2,000 book and 3,000 music nodes, connected by 16,000 people. Altogether there are about 300,000 relationships.
My questions are:
1. Am I doing something wrong? Is there a better way to do this? I am new to Neo4j. I have tried some of the techniques for tuning the Neo4j graph database: I have increased the min heap size to 4 GB and am running it on an 8-core machine with 32 GB RAM.
2. I want to know the strength of each of the rel1 relationships and the number of rel2 and rel3 relationships. rel1 has a "strength" property, but I have not been able to work out how to retrieve it.
Please advise, as I am on the verge of giving up on Neo4j and going back to SQL. At least it works. :(
Regards, Paritosh