0
votes

I'm constructing a data set based on sources from the late seventeenth and eighteenth centuries, detailing the relations between laypeople and various churches in the Dutch Republic. I would like to create a simple query that counts the number of baptisms and marriages that occurred in a particular church and period. I used this data to play around with:

CREATE (em0002: event: marriage { day: 16, month: 6, year: 1650, LOCAL_ID: 'em0002'})
CREATE (em0003: event: marriage { day: 16, month: 6, year: 1670, LOCAL_ID: 'em0003'})
CREATE (eb0001: event: baptism { day: 16, month: 6, year: 1665, LOCAL_ID: 'eb0001'})
CREATE (c0001: church: RCC { name: 'Kerk aan de Appelmarkt', location: 'Leiden', LOCAL_ID: 'c0001'})
CREATE (c0003: church: DRC { name: 'Gereformeerde Kerk', location: 'Leiden', LOCAL_ID: 'c0003'})
CREATE (em0001)-[:TOOK_PLACE_IN]->(c0001)
CREATE (em0003)-[:TOOK_PLACE_IN]->(c0001)
CREATE (eb0001)-[:TOOK_PLACE_IN]->(c0001)
CREATE (em0002)-[:TOOK_PLACE_IN]->(c0003)

It is fairly easy to retrieve the number of baptisms in a particular period, which can be done by this query:

WHERE e.year > 1650 AND e.year < 1680
RETURN count (e:baptism) AS Number_of_baptisms

The problem is when I want to create an overview with the number of baptisms in one column and the number of marriages in the other.

I simply don't know how to link the two queries (one counting the marriages and the other the baptisms. This one doesn't give the right result:

WHERE e.year_trouw > 1650 AND e.year_trouw < 1680 
MATCH (ee:baptism)-[:TOOK_PLACE_IN]->(ch:church:RCC {name: 'Kerk aan de Appelmarkt'})
WHERE ee.year > 1650 AND ee.year < 1680
RETURN count (e:marriage) AS Number_of_marriages, count (ee:baptism) AS Number_of_baptisms
2

2 Answers

0
votes

Perhaps something like this? Tailor according to your logic

MATCH (e1:baptism), (e2:marriage)
WITH {baptism:COUNT(DISTINCT e1), marriage:COUNT(DISTINCT e2)} AS col
RETURN col.baptism AS Baptism, col.marriage AS Marriage
0
votes

This query uses the REDUCE function:

MATCH (event:event)
WHERE 1650 < event.year < 1680
RETURN REDUCE(s={b:0, m:0}, e IN COLLECT(event) |
  CASE WHEN e:baptism THEN {b:s.b+1, m:s.m} ELSE {b:s.b, m:s.m+1} END
) AS counts

to produce the baptism/marriage counts:

╒═════════════╕
│"counts"     │
╞═════════════╡
│{"b":1,"m":1}│
└─────────────┘

The query, as written, assumes that any event that is not a baptism is a marriage.

BTW: An expression like count(e:baptism) does more work the way you intended. e:baptism evaluates the e node to see if it has the baptism label and returns a boolean value. So, you are actually counting the total number of trues and falses (i.e., the total number of e nodes). Instead, count(e) would give you the same count and is more efficient.