0
votes

I have the following data structure:

  • Order -> Contact -> Install -> Campaign
  • Order -> Contact -> Download -> Campaign

I have created the following Cypher Query:

MATCH (ca1:Campaign) - [CI] - (i:Installs) - [IC] - (co1:Contact) - [CO1] - (o1:Order), 
(ca2:Campaign) - [CD] - (d:Downloads) - [DC] - (co2:Contact) - [CO2] - (o2:Order) 
where i.DownloadDate > '6/1/16' and i.DownloadDate < '7/31/16' 
and d.DownloadDate > '6/1/16' and d.DownloadDate < '7/31/16'
RETURN ca1,CI,i,IC,co1,CO1,o1,ca2,CD,d,DC,co2,CO2,o2 limit 50

The Cypher query gives the following warning:

This query builds a cartesian product between disconnected patterns. If a part of a query contains multiple disconnected patterns, this will build a cartesian product between all those parts. This may produce a large amount of data and slow down query processing. While occasionally intended, it may often be possible to reformulate the query that avoids the use of this cross product, perhaps by adding a relationship between the different parts or by using OPTIONAL MATCH (identifiers are: (ca2, d, co2, o2))

Is there a better way to code this in CQL? (Sorry for the newbie question.) Thanks.

2
I'm rather confused here. Your query makes me think that you have a missing comma in your path, that it should be "-> Campaign, Order ->", indicating a different path, instead of "-> Campaign Order -> Contact". If that's the case, it looks like what you really have is Order -> Contact, and then Contact has both Install and Download relationships to Campaign. - InverseFalcon
You are right, there should be a comma at the end of the first Campaign. I had it on two lines, and the post merged them into a single line. Essentially, Contact has connections to two nodes (Install and Download), and those then connect back to the Campaign node. - Ravi
Okay, that helps. Can you make it clear what you want the query to do (a verbal description rather than an attempt at the query), and provide any additional information, such as whether you're starting from certain node(s) by id, or whether you want this information for everything matching in your db? Also, can you explain why Download and Install are two different nodes? I get the feeling that you might be able to optimize something here, but I'm not sure yet. - InverseFalcon
Can you explain in a bit more detail what result you are trying to achieve? - imran arshad
The 2 separate paths you've described don't explain the graph well enough, as they don't capture how the two paths interact with each other. Can we see a screenshot of a sample graph, please? - Haoyang Feng

2 Answers

0
votes

Making a first pass at this: my guess is that you're going for something more like all campaigns, installs/downloads of those campaigns, contacts for those installs/downloads, and associated orders, where the install or download date falls between the dates provided. Is that correct?

Your query as it stands is a cartesian product of everything (every single row of your installs match, in every possible combination with every row of your downloads match), when my guess is that what you really want is a union of the installs and downloads matches or, if they can be treated the same, a single query that matches against both.
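
If a union is what you want, a rough sketch could look like the following. The labels and the DownloadDate filter are copied unchanged from your query. The relationships are left untyped and undirected because I don't know your schema, and the short variable names are just placeholders. Both branches have to return the same column names for UNION to work.

```cypher
// Sketch only: two independent matches combined with UNION instead of
// one MATCH containing two disconnected patterns.
MATCH (ca:Campaign)--(n:Installs)--(co:Contact)--(o:Order)
WHERE n.DownloadDate > '6/1/16' AND n.DownloadDate < '7/31/16'
RETURN ca, n, co, o
LIMIT 25   // LIMIT applies to this branch only
UNION
MATCH (ca:Campaign)--(n:Downloads)--(co:Contact)--(o:Order)
WHERE n.DownloadDate > '6/1/16' AND n.DownloadDate < '7/31/16'
RETURN ca, n, co, o
LIMIT 25   // LIMIT applies to this branch only
```

UNION drops duplicate rows; use UNION ALL if you would rather keep them.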

My next question would be, do you really need every single element of the match? As it stands now you're returning every node of the match and every relationship between each node. Do you really need all that info in this query, or are the nodes enough? Also, do you need full nodes, or do you just need properties within each node?

Additionally, specifying the type and direction of your relationships would be a big improvement and should speed up your query.
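
As an illustration of what that means for a single hop, here is a small sketch. The :Has type and the arrow direction are guesses on my part, not something taken from your schema, so substitute whatever your data actually uses.

```cypher
// Before (as in the question): any relationship type, either direction
//   (co1:Contact)-[CO1]-(o1:Order)
// After: one relationship type, one direction.
// :Has and the direction are placeholders, not taken from your schema.
MATCH (co:Contact)<-[r:Has]-(o:Order)
RETURN co, r, o
LIMIT 50
```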

Assuming that all you need are the nodes, that you can treat install and download node data polymorphically, and that the relationship types on installs and downloads are the same, and making a stab at adding types and directions to your relationships (please correct me on those so I can fix them), this may work better for you:

MATCH (campaign:Campaign)<-[:Has]-(installOrDownload)<-[:Has]-(contact:Contact)<-[:Has]-(order:Order)
WHERE installOrDownload.DownloadDate > '6/1/16' AND installOrDownload.DownloadDate < '7/31/16'
RETURN campaign, installOrDownload, contact, order
LIMIT 50

Provided that the relationship types linking your installs/downloads with campaigns, and linking contacts with installs/downloads, are the same (and if not, there are ways around that too), installOrDownload will match both Install and Download nodes.
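
Because installOrDownload has no label in that pattern, it would also match any other kind of node that happens to sit between a Contact and a Campaign. If that is a concern, a variant along these lines restricts it to the two labels from your query. As before, :Has and the directions are my guesses, and kind is just an alias I made up.

```cypher
// Sketch: same single pattern, with the middle node limited to the two
// labels used in the question. :Has and the directions are assumptions.
MATCH (campaign:Campaign)<-[:Has]-(installOrDownload)<-[:Has]-(contact:Contact)<-[:Has]-(o:Order)
WHERE (installOrDownload:Installs OR installOrDownload:Downloads)
  AND installOrDownload.DownloadDate > '6/1/16'
  AND installOrDownload.DownloadDate < '7/31/16'
// labels() shows which kind of node each row matched
RETURN campaign, labels(installOrDownload) AS kind, installOrDownload, contact, o
LIMIT 50
```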

-2
votes

I don't think you can code for nodes that have multiple relationships due to the fact that nodes are monogamous by default.