has anybody experienced issues with the aggregate pipeline? Specifically using the Mongo API? I have quite large query with a few stages, the $group stage is not able to find a particular field in the output of the previous $project stage although if I comment out the $group stage I can definitely see the field I am expecting being output as part of the $project.
The following sets up two collections, each with a single document for demonstration purposes.
db.vacancies.insert({
"_id" : NUUID("35b3068c-d300-4ae1-bf45-cb00f0f7c37b"),
"employerAccountId" : "MYJR4X",
"vacancyReference" : NumberLong(1000000021),
"status" : "Closed",
"applicationMethod" : "ThroughPhone",
"closingDate" : ISODate("2019-03-01T00:00:00.000Z")
});
db.applicationReviews.insert({
"_id" : NUUID("43a7764a-7d4d-465b-9c30-9934848c009c"),
"candidateId" : NUUID("84dc9645-605c-4606-812b-117b090020b0"),
"vacancyReference" : NumberLong(1000000021),
"status" : "New",
"createdDate" : ISODate("2018-09-27T13:46:09.801Z")
});
The following is the aggregate pipeline query:
db.vacancies.aggregate([
{
"$match":{
"employerAccountId":"MYJR4X"
}
},
{
"$lookup":{
"from":"applicationReviews",
"localField":"vacancyReference",
"foreignField":"vacancyReference",
"as":"application"
}
},
{
"$unwind":{
"path":"$application",
"preserveNullAndEmptyArrays":true
}
},
{
"$project":{
"vacancyGuid":"$_id",
"vacancyReference":1,
"status":1,
"appStatus":"$application.status",
"closingDate":1,
"applicationMethod":1
}
},
{
"$project":{
"vacancyGuid":1,
"vacancyReference":1,
"status":1,
"closingDate":1,
"applicationMethod":1,
"isNew":{
"$cond":{
"if":{
"$eq":[
"$appStatus",
"New"
]
},
"then":1,
"else":0
}
},
"isSuccessful":{
"$cond":{
"if":{
"$eq":[
"$appStatus",
"Successful"
]
},
"then":1,
"else":0
}
},
"isUnsuccessful":{
"$cond":{
"if":{
"$eq":[
"$appStatus",
"Unsuccessful"
]
},
"then":1,
"else":0
}
}
}
},
{
"$group":{
"_id":{
"vacancyGuid":"$vacancyGuid",
"vacancyReference":"$vacancyReference",
"status":"$status",
"applicationMethod": "$applicationMethod",
"closingDate":"$closingDate"
},
"noOfNewApplications":{
"$sum":"$isNew"
},
"noOfSuccessfulApplications":{
"$sum":"$isSuccessful"
},
"noOfUnsuccessfulApplications":{
"$sum":"$isUnsuccessful"
}
}
}
]);
I get the following result when using MongoDB:
{
"_id" : {
"vacancyGuid" : NUUID("35b3068c-d300-4ae1-bf45-cb00f0f7c37b"),
"vacancyReference" : NumberLong(1000000021),
"status" : "Closed",
"applicationMethod" : "ThroughPhone",
"closingDate" : ISODate("2019-03-01T00:00:00.000Z")
},
"noOfNewApplications" : 1.0,
"noOfSuccessfulApplications" : 0.0,
"noOfUnsuccessfulApplications" : 0.0
}
The above is what I expect, however when using Cosmos DB, either hosted on Azure or the Azure Cosmos Emulator I get the following result:
{
"_id" : {
"vacancyGuid" : NUUID("35b3068c-d300-4ae1-bf45-cb00f0f7c37b"),
"vacancyReference" : 1000000021,
"status" : "Closed",
"closingDate" : ISODate("2019-03-01T00:00:00.000Z")
},
"noOfNewApplications" : 1.0,
"noOfSuccessfulApplications" : 0.0,
"noOfUnsuccessfulApplications" : 0.0
}
^^ note the applicationMethod
field is not returned.
I have tried the following:
- commenting out/removing the group
phase of the aggregate pipeline - the projection phases return applicationMethod
correctly
- aliasing the applicationMethod
field in the projection stages to another name e.g. am
, didn't make a difference to the grouping.
- changing the name of the aliased collection in the as
field of the lookup
phase to something other than application
I have the following assertions:
application
is a reserved keyword, which has an effect even when used as a prefix to a field name.application
is a reserved keyword in Cosmos only (Is this because of the Application reserved word in SQL? (SQL API))
I know calling my source field on the vacancies
collection something other than applicationMethod
would solve the issue but that is not practical for existing documents and also the naming I would like to continue using. It seems there is a problem with Cosmos DB and the way it handles the group
stage of the aggregate pipeline.