I created a simple example dataset of three words: w1
with no searches, w2
with 3 searches and w3
with 6.
CREATE (w1:Word {w: '1'})
WITH count(*) AS dummy
CREATE (w2:Word {w: '2'}) WITH w2
UNWIND range(1, 3) AS i
CREATE (w2)<-[:Searched]-(:Session)
WITH count(*) AS dummy
CREATE (w3:Word {w: '3'}) WITH w3
UNWIND range(1, 6) AS i
CREATE (w3)<-[:Searched]-(:Session)
I would approach it like this: first, let's create a list with the upper limits for each bucket:
RETURN [i IN range(0, 4) | i*5] AS upperLimits
╒══════════════╕
│"upperLimits" │
╞══════════════╡
│[0,5,10,15,20]│
└──────────────┘
Second, use this with a list comprehension that selects the elements from the list that has a sufficiently large upper limit. The first one of these is our bucket, so we select that with the [0]
list indexer. The rest is just calculating the lower limit and ordering rows:
WITH [i IN range(0, 4) | i*5] AS upperLimits
MATCH (n:Word)
WITH upperLimits, ID(n) AS n, size((n)<-[:Searched]-()) AS numOfRelationships
WITH
[upperLimit IN upperLimits WHERE numOfRelationships <= upperLimit][0] AS upperLimit,
count(n) AS count
RETURN
upperLimit - 4 AS lowerLimit,
upperLimit,
count
ORDER BY lowerLimit
The query gives the following results:
╒════════════╤════════════╤═══════╕
│"lowerLimit"│"upperLimit"│"count"│
╞════════════╪════════════╪═══════╡
│-4 │0 │1 │
├────────────┼────────────┼───────┤
│1 │5 │1 │
├────────────┼────────────┼───────┤
│6 │10 │1 │
└────────────┴────────────┴───────┘
Potential improvements:
(1) If the value of numOfRelationships
is larger than the largest upper limit, the query above will return the first element of an empty list, which is null
. To avoid that, either 1) set a sufficiently large upper limit, e.g.
MATCH (n:Word)
WITH max(size((n)<-[:Searched]-())) AS maxNumberOfRelationShips
WITH [i IN range(-1, maxNumberOfRelationShips/5+1) | {lower: i*5-4, upper: i*5}] AS limits
RETURN *
You can use the top bucket with "16 or larger" semantics with coalesce
.
(2) -4
as a lower limit is not very nice, we can use CASE
to get rid of it.
Putting all this together, we get this:
MATCH (n:Word)
WITH max(size((n)<-[:Searched]-())) AS maxNumberOfRelationShips
WITH [i IN range(0, maxNumberOfRelationShips/5+1) | i*5] AS upperLimits
MATCH (n:Word)
WITH upperLimits, ID(n) AS n, size((n)<-[:Searched]-()) AS numOfRelationships
WITH
[upperLimit IN upperLimits WHERE numOfRelationships <= upperLimit][0] AS upperLimit,
count(n) AS count
RETURN
CASE WHEN upperLimit - 4 < 0 THEN 0 ELSE upperLimit - 4 END AS lowerLimit,
upperLimit,
count
ORDER BY lowerLimit
Which results in:
╒════════════╤════════════╤═══════╕
│"lowerLimit"│"upperLimit"│"count"│
╞════════════╪════════════╪═══════╡
│0 │0 │1 │
├────────────┼────────────┼───────┤
│1 │5 │1 │
├────────────┼────────────┼───────┤
│6 │10 │1 │
└────────────┴────────────┴───────┘