I'm struggling to write a Cypher query that samples my users' activity.
What my graph looks like
I have a few million users recorded in my graph, each with an indexed property UserId.
I have a few hundred products, each with an indexed property ItemId.
Users are connected to products through [INTERACTS] relationships.
What I'm trying to do
I would like to get an aggregate view over the path user-[INTERACTS]-product-[INTERACTS]-user-[INTERACTS]-product. In plain English, I would like to know which products look-alike users are interacting with. E.g. if I interact with products A and B, which other products do the users who interact with A and B generally interact with?
It's what Amazon does.
My problem
I can't simply match the above pattern; it takes way too long to execute. So I thought I could sample my users' activity as follows:
- Take only the last 20 products my user interacted with
- For each of those products, take only the last 20 users who interacted with it
- For each of those users, take only the last 20 products they interacted with, and count each product's occurrences
But I don't know if this is even possible in a single Cypher query.
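To make the intent of these three steps concrete, here is a minimal Python sketch of the sampling over an in-memory list of interactions. It is not Cypher and not my production code. The (user, product, timestamp) tuple format, the use of a timestamp to define "last", and the function names build_index and sample_lookalike_products are assumptions made only for illustration. It also assumes at most one interaction per user/product pair.

```python
from collections import Counter, defaultdict


def build_index(interactions):
    """Index interactions by user and by product, newest first.

    `interactions` is an iterable of (user, product, timestamp) tuples.
    This format is hypothetical; the real graph may order interactions
    differently.
    """
    by_user = defaultdict(list)
    by_product = defaultdict(list)
    # Sort newest first so that slicing [:k] yields the "last k" items.
    for user, product, _ts in sorted(interactions, key=lambda t: t[2], reverse=True):
        by_user[user].append(product)
        by_product[product].append(user)
    return by_user, by_product


def sample_lookalike_products(root, by_user, by_product, k=20):
    """Count products reached via root -> product -> user -> product,
    keeping only the last k neighbours at each step, per node."""
    counts = Counter()
    # Step 1: last k products of the root user.
    for p1 in by_user.get(root, [])[:k]:
        # Step 2: last k other users of this particular product.
        peers = [u for u in by_product.get(p1, []) if u != root][:k]
        for u1 in peers:
            # Step 3: last k products of this particular user.
            for p in by_user.get(u1, [])[:k]:
                if p != p1:
                    counts[p] += 1
    return counts
```

The important point is that the cap of k applies per product and per user at each depth, not to the total number of rows, so the worst case is k * k * k visited paths for one root user.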
The closest I came was the following query. But it is still too slow, and it does not do what I want. Each LIMIT caps the whole intermediate result rather than the rows per product or per user, so basically it gives me the latest products used by the "latest" user who has the "latest" products in common with the root user. That seems logical to me, but it fails to sample my users' activity.
START u=node:node_auto_index('UserId:9554')
MATCH
u-[i1:INTERACTS]-p1
WITH
u,p1,i1
LIMIT 20
MATCH
p1-[i2:INTERACTS]-u1
WHERE
NOT(u1=u)
WITH
i1,i2,u1,p1
LIMIT 400
MATCH
u1-[i3:INTERACTS]-p
WHERE
NOT(p1=p) AND p.ProjectId = {ProjectId} AND p.IsActive? = 1
RETURN
i1.Label, i2.Label,i3.Label, p.ItemId,count(p) as count
LIMIT 8000
Where I am now
After some more unsuccessful testing I tried coding it with the Java API, and it is much simpler and more straightforward. But out of curiosity, and because my system currently uses Cypher, I would like to know how to do this in Cypher.
For the sake of testing, I think my question can be reduced to: given a pattern, what are the last 2 nodes at each depth?
I created http://console.neo4j.org/?id=inf2hn to test it. I think the final result I'm looking for should look like:
Product 2 | User 3 | Product 5
Product 2 | User 3 | Product 6
Product 2 | User 4 | Product 5
Product 2 | User 4 | Product 6
Product 3 | User 3 | Product 5
Product 3 | User 3 | Product 6
Product 3 | User 4 | Product 5
Product 3 | User 4 | Product 6
Thanks for your help.