I have 3 nodes created in graph database, origin airport and destination airport and carrier. They are related by a property named 'canceled_by'.
MATCH (origin:origin_airport {name: row.ORIGIN}),
(destination:dest_airport {name: row.DEST}),
(carrier:Carrier {name: row.UNIQUE_CARRIER})
CREATE (origin)-[:cancelled_by {cancellation: row.count}]->(carrier)
CREATE (origin)-[:cancelled_by {cancellation: row.count}]->(destination)
CREATE (origin)-[:operated_by {carrier: row.UNIQUE_CARRIER}]->(carrier)
cancelled_by holds the value number of times particular carrier is cancelled. My input file will be of below format:
ORIGIN UNIQUE_CARRIER DEST Cancelled
ABE DL ATL 1
ABE EV ATL 1
ABE EV DTW 3
ABE EV ORD 3
ABQ DL DFW 2
ABQ B6 JFK 2
Here I need to calculate the cancellation percentage of each carrier. I am expecting result as below:
UNIQUE_CARRIER DEST Percentage_Cancelled
DL 25%
EV 58.33%
B6 16.66%
Example: Total number of cancellation = 12
No of cancellation for DL = 3
Percentage of cancellation for DL = (3/12)*100 = 25%
Below query gives the sum of cancellation for each carrier:
MATCH ()-[ca:cancelled_by]->(c:Carrier)
RETURN c.name AS Carrier,
SUM(toFloat(ca.cancellation)) As sum
ORDER BY sum DESC
LIMIT 10
I tried the below query for calculating percentage:
MATCH ()-[ca:cancelled_by]->(c:Carrier)
WITH SUM(toFloat(ca.cancellation)) As total
MATCH ()-[ca:cancelled_by]->(c:Carrier)
RETURN c.name AS Carrier,
(toFloat(ca.cancellation)/total)*100 AS percent
ORDER BY percent DESC
LIMIT 10
But it is not calculating percentage by grouping and instead calculating the percentage individually.
Carrier sum
DL 0.36862408915559364
DL 0.34290612944706383
DL 0.3171881697385341
How to calculate percentage based on group_by using cypher queries in Neo4j?