I am struggling to figure out how to do a nested GROUP BY in Gremlin.
My graph is like this (vertices in [...]):
[survey]<--forSurvey--[ce]--collectedFrom-->[ce sample]--currentResponseState-->[responseState]
- 1 survey has many CEs.
- 1 CE has many CE samples.
- Each CE sample has a current response state.
- There are 3 possible states (not started, in progress, completed).
I want to count the CE samples by CE and response state. For example:
CE=CE01, State=not started, count=45.
CE=CE01, State=in progress, count=34.
CE=CE01, State=completed, count=25.
CE=CE02, State=not started, count=60.
CE=CE02, State=in progress, count=33.
CE=CE02, State=completed, count=72.
CE=CE03, State=not started, count=44.
etc.
I can get the group counts using the following query. It returns one row per CE with the correct count for each state (one column per state), but it doesn't show the CE name:
g.V().hasLabel("survey").has("surveyId","LFS")
.in("forSurvey").as("ce")
.local(out("collectedFrom")
.out("currentResponseState")
.groupCount().by("code")).as("counts")
.select("counts")
But I cannot figure out how to get Gremlin to return the CE name together with the counts. If I add "ce" to the select() projection to get the name of the CE vertex, I get no results at all.
This would be trivial in SQL, where I would just GROUP BY the relevant columns, but Gremlin is a different world.
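In Gremlin terms, the shape I am after is probably a group() keyed by the CE, with a groupCount() over the response states as the value traversal. The following is only an untested sketch of that idea: the "name" property key on the CE vertex is an assumption (the schema above does not say which property holds the CE name), and it also assumes CE names are unique within a survey.

```groovy
// Untested sketch: outer group() keyed by CE, inner groupCount() keyed by state code.
// ASSUMPTION: "name" is the property on the CE vertex that holds values like "CE01".
g.V().hasLabel("survey").has("surveyId", "LFS")
  .in("forSurvey")
  .group()
    .by("name")                          // outer key: one entry per CE (assumed property)
    .by(out("collectedFrom")             // CE -> its CE samples
        .out("currentResponseState")     // CE sample -> its current response state
        .groupCount().by("code"))        // inner map: state code -> number of samples
```

If this works as intended, the result would be a single map from CE name to a map of state counts, rather than one row per CE.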
Any tips on how to do this with DSE Graph 5.0 Gremlin would be greatly appreciated.