2
votes

I have a query that is returning the total number of entries in a collection per year-month, grouped by a location.

This is returning data exactly as I need it if the location has results for the year-month in question.

However, is it possible to insert an entry for a month that does not have a result? For instance lets say if my $match has a date range of 01-2019 to 12-2019. I would like to have all 12 entries for the month with a default of total: 0.

Truncated Schema :

{
  branchId: { type: String, required: true },
  orgId: { type: String, required: true },
  stars: { type: Number, default: 0 },
  reviewUpdatedAt: { type: Date, default: Date.now }
}

What I've tried:

[
    {
        $match: {
            stars: { $exists: true, $gte: 1 },
            orgId: '100003'
            reviewUpdatedAt: { $gte: new Date(fromDate), $lte: new Date(toDate) }
        }
    },
  {
    $group: {
      _id: {
        date: {
          $dateToString: {
            format: "%m-%Y",
            date: "$reviewUpdatedAt"
          }
        },
        loc: "$branchId"
      },
      total: {
        $sum: 1
      }
    }
  },
  {
    $group: {
      _id: "$_id.loc",
      reviews: {
        $push: {
          total: "$total",
          "date": "$_id.date"
        }
      }
    }
  }
]
3

3 Answers

1
votes

At first I thought this can be easily achieved through code, but even with MongoDB you can do that but with an input from code :

Let's say if your fromDate is June-2018 & toDate is June-2019, then by using your programming language you can easily get all months between those two dates in this format mm-yyyy. You can try to do this using MongoDB but I would rather prefer as an input to query.

Query :

db.collection.aggregate([
    {
      $group: {
        _id: {
          date: {
            $dateToString: {
              format: "%m-%Y",
              date: "$reviewUpdatedAt"
            }
          },
          loc: "$branchId"
        },
        Total: {
          $sum: 1
        }
      }
    },
    {
      $group: {
        _id: "$_id.loc",
        reviews: {
          $push: {
            Total: "$Total",
            "date": "$_id.date"
          }
        }
      }
    },
    /** Overwrite existing reviews field with new array, So forming new array :: 
     * as you're passing all months between these dates get a difference of two arrays (input dates - existing dates after group)
     * while will leave us with an array of missing dates, we would iterate on that missing dates array &
     * concat actual reviews array with each missing date
     * */
    {
      $addFields: {
        reviews: {
          $reduce: {
            input: {
              $setDifference: [
                [
                  "06-2018",
                  "07-2018",
                  "08-2018",
                  "09-2018",
                  "10-2018",
                  "11-2018",
                  "12-2018",
                  "01-2019",
                  "02-2019",
                  "03-2019",
                  "04-2019",
                  "05-2019",
                  "06-2019"
                ],
                "$reviews.date"
              ]
            },
            initialValue: "$reviews",
            in: {
              $concatArrays: [
                "$$value",
                [
                  {
                    date: "$$this",
                    Total: 0
                  }
                ]
              ]
            }
          }
        }
      }
    }
  ])

Test : MongoDB-Playground

Ref : javascript-get-all-months-between-two-dates

0
votes

so step back and realize you seek a display of data that doesn't exist in the db...let's say there is no data for 3/19. this is not a mongo issue but universal for any db. one creates a 'time table' in your case perhaps it is month/year...and for mongo it is documents/collection...this provides framework data for each month for the initial match..to which one's join ($lookup in mongo) will have null for 3/19...

adding a time table is standard in analytic apps -some come with that feature embedded as part of their time based analytics feature so the database doesn't need to do anything.....but to do so via general query/reporting in mongo and sql databases one would need to manually add that time collection/table

0
votes

Starting in Mongo 5.1release schedule, it will be a perfect use case for the new $densify aggregation operator:

// { date: "02-2019", value: 12 }
// { date: "03-2019", value: 2 }
// { date: "11-2019", value: 3 }
db.collection.aggregate([
  { $set: {
    date: { $dateFromString: { // "02-2019" => ISODate("2019-04-01")
      dateString: { $concat: [ "01-", "$date" ] },
      format: "%d-%m-%Y"
    }}
  }},
  { $densify: {
    field: "date",
    range: {
      step: 1,
      unit: "month",
      bounds: [ISODate("2019-01-01"), ISODate("2020-01-01")]
    }
  }},
  { $set: {
    value: { $cond: [ { $not: ["$value"] }, 0, "$value" ] },
    date: { $dateToString: { format: "%m-%Y", date: "$date" } } // ISODate("2019-04-01") => "02-2019"
  }}
])
// { date: "01-2019", value: 0 }
// { date: "02-2019", value: 12 }
// { date: "03-2019", value: 2 }
// { date: "04-2019", value: 0 }
// { date: "05-2019", value: 0 }
// { date: "06-2019", value: 0 }
// { date: "07-2019", value: 0 }
// { date: "08-2019", value: 0 }
// { date: "09-2019", value: 0 }
// { date: "10-2019", value: 0 }
// { date: "11-2019", value: 3 }
// { date: "12-2019", value: 0 }

This:

  • casts date strings into dates (the first $set stage)
  • densifies documents ($densify) by creating new documents in a sequence of documents where certain values for a field (in our case field: "date") are missing:
    • the step for our densification is 1 month: range: { step: 1, unit: "month", ... }
    • and we densify within the range of dates provided with bounds: [ISODate("2019-01-01"), ISODate("2020-01-01")]
  • sets dates back to date strings: date: { $dateToString: { format: "%m-%Y", date: "$date" } }
  • and also sets ($set) views to 0 only for new documents included during the densify stage ({ value: { $cond: [ { $not: ["$value"] }, 0, "$value" ] })