0
votes

I am trying to create a count of dates that occur during particular weeks (the last five weeks including this week) and a total running count for dates which would include records prior to the listed five week duration.
I could do this fairly easily using standard Excel formula, but I'm trying to learn PowerQuery and the M language.

My first query returns the relevant rows from the original source data (another query that pulls everything from the Excel table and forces the correct data type for each column).
The dates I'm trying to count are contained in the Section A#(lf)Turnbacks.
The query is called Triage_Turnbacks.

let
    Source = #"Source_Data",
    #"Filter_Rows" = Table.SelectRows(Source, each ([#"Cancelled?"] = null) and ([#"Section A#(lf)Turnbacks"] <> null)),
    #"Removed_Columns" = Table.SelectColumns(Filter_Rows,{"VENDOR CODE", "Supplier Lookup", "Section A#(lf)Turnbacks"}),
    #"Duplicate_Date_Column" = Table.DuplicateColumn(Removed_Columns, "Section A#(lf)Turnbacks", "Start_Of_Week"),
    #"Start_of_Week" = Table.TransformColumns(#"Duplicate_Date_Column",{{"Start_Of_Week", Date.StartOfWeek, type date}})
in
    Start_of_Week   

| VENDOR CODE | Supplier Lookup | Section ATurnbacks | Start_Of_Week |
|-------------|-----------------|--------------------|---------------|
| A1          | Supplier A      | 08/04/2019         | 08/04/2019    |
| A1          | Supplier A      | 08/04/2019         | 08/04/2019    |
| A2          | Supplier B      | 16/04/2019         | 15/04/2019    |
| A3          | Supplier C      | 30/05/2019         | 27/05/2019    |
| A10         | Supplier D      | 24/04/2019         | 22/04/2019    |
| A5          | Supplier E      | 18/06/2019         | 17/06/2019    |
| A1          | Supplier A      | 17/07/2019         | 15/07/2019    |
| A1          | Supplier A      | 24/06/2019         | 24/06/2019    |
| A2          | Supplier B      | 11/06/2019         | 10/06/2019    |
| A3          | Supplier C      | 03/07/2019         | 01/07/2019    |
| A10         | Supplier D      | 02/07/2019         | 01/07/2019    |
| A5          | Supplier E      | 17/07/2019         | 15/07/2019    |
| A1          | Supplier A      | 17/07/2019         | 15/07/2019    |
| A1          | Supplier A      | 05/08/2019         | 05/08/2019    |
| A2          | Supplier B      | 12/07/2019         | 08/07/2019    |
| A1          | Supplier A      | 05/08/2019         | 05/08/2019    |
| A1          | Supplier A      | 29/07/2019         | 29/07/2019    |
| A2          | Supplier B      | 12/06/2019         | 10/06/2019    |
| A3          | Supplier C      | 21/02/2019         | 18/02/2019    |
| A10         | Supplier D      | 23/04/2019         | 22/04/2019    |
| A5          | Supplier E      | 14/06/2019         | 10/06/2019    |
| A10         | Supplier D      | 08/02/2019         | 04/02/2019    |

A second query returns the previous five week Monday and Sunday dates.
The query is called 5_Week_Calendar.

    let
        Source = Table.FromList({-5..0}, each{_}),
        AddedStartOfWeek = Table.AddColumn(Source, "StartOfWeek", each Date.StartOfWeek(Date.AddDays(Date.From(DateTime.LocalNow()),[Column1]*7)), type date),
        AddedEndOfWeek = Table.AddColumn(AddedStartOfWeek, "EndOfWeek", each Date.EndOfWeek([StartOfWeek]), type date),
        RemovedColumn = Table.RemoveColumns(AddedEndOfWeek,{"Column1"})
    in RemovedColumn  

| StartOfWeek | EndOfWeek  |
|-------------|------------|
| 01/07/2019  | 07/07/2019 |
| 08/07/2019  | 14/07/2019 |
| 15/07/2019  | 21/07/2019 |
| 22/07/2019  | 28/07/2019 |
| 29/07/2019  | 04/08/2019 |
| 05/08/2019  | 11/08/2019 |

My third query returns the count per week (i.e. which records in Triage_Turnbacks.Start_Of_Week matches 5_Week_Calendar.StartOfWeek.

let
    Source = Table.NestedJoin(#"5_Week_Calendar", {"StartOfWeek"}, Triage_Turnbacks, {"Start_Of_Week"}, "Triage_Turnbacks", JoinKind.LeftOuter),
    #"Removed Columns" = Table.RemoveColumns(Source,{"EndOfWeek"}),
    #"Aggregated Triage_Turnbacks" = Table.AggregateTableColumn(#"Removed Columns", "Triage_Turnbacks", {{"Start_Of_Week", List.Count, "Total Turnbacks"}})
in
    #"Aggregated Triage_Turnbacks"  

    | StartOfWeek | Total Turnbacks |
    |-------------|-----------------|
    | 01/07/2019  | 2               |
    | 08/07/2019  | 1               |
    | 15/07/2019  | 3               |
    | 22/07/2019  | 1               |
    | 29/07/2019  | 1               |
    | 05/08/2019  | 2               |

The problem

How do I add the running total which includes records before the initial 5 week date of 01/07/2019?

| StartOfWeek | Total Turnbacks | Running Total |
|-------------|-----------------|---------------|
| 01/07/2019  | 2               | 15            |
| 08/07/2019  | 1               | 16            |
| 15/07/2019  | 3               | 19            |
| 22/07/2019  | 1               | 19            |
| 29/07/2019  | 1               | 20            |
| 05/08/2019  | 2               | 22            |
1

1 Answers

1
votes

A running total is simple in concept but, as far as I know, M's standard library does not currently provide anything like List.CumulativeSum or Table.CumulativeSum. This makes things a bit awkward and a custom function will likely be needed.

A running total can be achieved with something like:

ListCumulativeSum = (listOfNumbers as list) as list => List.Accumulate(listOfNumbers, {}, (listState, currentItem) => listState & {List.Last(listState, 0) + currentItem}),

which, given a list of numbers, returns a list of numbers.

The equivalent for tables might be the function below. (I am sure there are many ways of doing this; some more performant/scalable than others.)

TableCumulativeSum = (someTable as table, columnToSum as text, newColumnName as text) =>
    let
        runningTotal = ListCumulativeSum(Table.Column(someTable, columnToSum)),
        rightTable = Table.FromColumns({{0..List.Count(runningTotal)-1}, runningTotal}, {"$rightJoinKey", newColumnName}),
        leftTable = Table.AddIndexColumn(someTable, "$leftJoinKey"),
        joined = Table.Join(leftTable, {"$leftJoinKey"}, rightTable, {"$rightJoinKey"}, JoinKind.FullOuter),
        dropped = Table.RemoveColumns(joined, {"$leftJoinKey", "$rightJoinKey"})
    in dropped,

Overall and going back to your question, I think it can be achieved with the code below:

let
    // This is just a function which returns a record. The record itself
    // contains two values:
    //    • a list of dates (which will be the "left join column")
    //    • a replacer function (which conditionally replaces dates)
    GetReplacerAndDates = (n as number, replaceWith as any) as record => 
        let
            startOfCurrentWeek = Date.StartOfWeek(DateTime.Date(DateTime.LocalNow())),
            nMondaysAgo = Date.AddWeeks(startOfCurrentWeek, -n),
            defaultAndLastNWeeks = {replaceWith} & List.Dates(nMondaysAgo, n + 1, #duration(7, 0, 0, 0)),
            conditionalReplacer = (someDate as date) =>
                let
                    startOfWeek = Date.StartOfWeek(someDate),
                    startOfWeekOrReplacement = if startOfWeek >= nMondaysAgo then startOfWeek else replaceWith
                in startOfWeekOrReplacement,
            toReturn = [replacer = conditionalReplacer, values = defaultAndLastNWeeks]
        in toReturn,
    rec = GetReplacerAndDates(5, "Prior period"), // Needs a better variable name.

    // You don't need this below, I only used it to give me a starting point
    // and so that I could verify my answer.
    triageTurnbacks = Table.FromColumns(
        {
            {"A1","A1","A2","A3","A10","A5","A1","A1","A2","A3","A10","A5","A1","A1","A2","A1","A1","A2","A3","A10","A5","A10"},
            {"Supplier A","Supplier A","Supplier B","Supplier C","Supplier D","Supplier E","Supplier A","Supplier A","Supplier B","Supplier C","Supplier D","Supplier E","Supplier A","Supplier A","Supplier B","Supplier A","Supplier A","Supplier B","Supplier C","Supplier D","Supplier E","Supplier D"},
            {#date(2019,4,8), #date(2019,4,8), #date(2019,4,16), #date(2019,5,30), #date(2019,4,24), #date(2019,6,18), #date(2019,7,17), #date(2019,6,24), #date(2019,6,11), #date(2019,7,3), #date(2019,7,2), #date(2019,7,17), #date(2019,7,17), #date(2019,8,5), #date(2019,7,12), #date(2019,8,5), #date(2019,7,29), #date(2019,6,12), #date(2019,2,21), #date(2019,4,23), #date(2019,6,14), #date(2019,2,8)}
        },
        type table [Vendor Code = text, Supplier Lookup = text, Section A Turnbacks = date]
    ),
    // I think the single step "startOfWeekColumn" below should probably replace the #"Duplicate_Date_Column"
    // and #"Start_of_Week" steps of your Triage_Turnbacks query.
    // You could also assign rec[replacer] to a variable,
    // just to make the code/call site easier to read.
    startOfWeekColumn = Table.AddColumn(triageTurnbacks, "startOfWeek", each rec[replacer]([Section A Turnbacks]), type any),
    datesToJoin = Table.FromColumns({rec[values]}, {"startOfWeek"}),
    joined = Table.NestedJoin(datesToJoin, "startOfWeek", startOfWeekColumn, "startOfWeek", "Total Turnbacks", JoinKind.LeftOuter),
    reduced = Table.TransformColumns(joined, {{"Total Turnbacks", Table.RowCount, type number}}),

    // Helper functions. I don't think cumulative summation can be achieved natively (as far as I know).
    ListCumulativeSum = (listOfNumbers as list) as list => List.Accumulate(listOfNumbers, {}, (listState, currentItem) => listState & {List.Last(listState, 0) + currentItem}),
    TableCumulativeSum = (someTable as table, columnToSum as text, newColumnName as text) as table =>
        let
            runningTotal = ListCumulativeSum(Table.Column(someTable, columnToSum)),
            rightTable = Table.FromColumns({{0..List.Count(runningTotal)-1}, runningTotal}, {"$rightJoinKey", newColumnName}),
            leftTable = Table.AddIndexColumn(someTable, "$leftJoinKey"),
            joined = Table.Join(leftTable, {"$leftJoinKey"}, rightTable, {"$rightJoinKey"}, JoinKind.FullOuter),
            dropped = Table.RemoveColumns(joined, {"$leftJoinKey", "$rightJoinKey"}),
            changedTypes = Table.TransformColumnTypes(dropped, {{newColumnName, type number}})
        in changedTypes,
    runningTotal = TableCumulativeSum(reduced, "Total Turnbacks", "Running Total")
in
    runningTotal

which gives me this:

Query output

which I think matches your expected output with the exception of the Total Turnbacks column of the week starting 22/07/2019 (where you have 1 and I have 0).

You could obviously filter/remove the "Prior period" row from the final table. For further details, please see comments in the code. If they don't help, let me know.