0
votes

The scenario and data structure is very simple.

I have a list with the product code and the month that this product have been retailed. A example of such data can be seen at the first two columns in green at the image below.

Data Structure

Then I need to check for each product If it was retailed also on the last month, on the last 3 months or in the last 12 months. The result would be the next three columns in yellow on the image.

These calculations (yellow columns) are easy to be computed at Excel by using some IF and COUNTIFS formulas, but when migrating it to Power BI I'm struggling with the performance of my code at Power Query. As there are thousands of products for each month, the Power Query calculation is taking too long.

Check below the code I've designed. The code snapshot would be for the second yellow column, to advise whether there was a retail on the last 3 months or not of that product.

In essence what I'm doing is adding a calculated column that is counting the rows of a table that is being filtered with the product code information and relevant date.

What would be a better approach in terms of performance to get the information I need?

Thank you.

Code:

// Add a calculated column.
AdicionarHits03Meses = Table.AddColumn(
    AdicionarHits01Mes,
    "Hit nos últimos 3 meses?",
    (r)=> 
        
        // Check if...
        if    
        
            // Returns the rows count of a table.
            Table.RowCount(
                
                // Returns a table with the condition.
                Table.SelectRows(
                    ChangeType,

                    // Condition:
                    (q)=> 

                        // Same Product Code.
                        q[#"Product Code"] = r[#"Product Code"] 
                        
                        and 
                        
                        // Check the retail month.
                        q[#"Retail month"] <= Date.AddMonths(r[#"Retail month"], -1) and
                        q[#"Retail month"] >= Date.AddMonths(r[#"Retail month"], -3)
                )
            )

        = 0 then
        
        // No retail found.
        0 else

        // Retail found.
        1
    
    ,
    Int64.Type
)
1
It seems to me that your function is overkill? Why use Table.SelectRows within your Table.AddColumn function? Why not just use each within the Table.AddColumn function to evaluate the expression in line with your rows? Your function has no actual outer references? Given your desired output, I don't see why that is required? For example, I don't see why the r function could not just be each if [#"Retail month"] <= Date.AddMonths([#"Retail month"], -1) and [#"Retail month"] >= Date.AddMonths([#"Retail month"], -3) then 1 else 0?trenton-ftw

1 Answers

0
votes

You can likely improve performance with some clever self-merges.

See if this makes sense to you:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUfJKzFMwMjAyVIrViVYyQhcwRhcAaXFLTUIV8E0sQjXDN7ESzdDSHKhALAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Product Code" = _t, #"Retail Month" = _t]),
    Original = Table.TransformColumnTypes(Source,{{"Product Code", Int64.Type}, {"Retail Month", type date}}),
    #"Added Offset Lists" = Table.AddColumn(Original, "Offset", each {1..12}),
    #"Expanded Offset Column" = Table.ExpandListColumn(#"Added Offset Lists", "Offset"),
    #"Added Prev Column" = Table.AddColumn(#"Expanded Offset Column", "Prev", each Date.AddMonths([Retail Month], -[Offset] ), type date),
    #"Inner Join Prev to Original" = Table.NestedJoin(#"Added Prev Column", {"Product Code", "Prev"}, Original, {"Product Code", "Retail Month"}, "Retail", JoinKind.Inner),
    #"Merge Original to Prev" = Table.NestedJoin(Original, {"Product Code", "Retail Month"}, #"Inner Join Prev to Original", {"Product Code", "Retail Month"}, "Min Offset", JoinKind.LeftOuter),
    #"Expanded Min Offset" = Table.TransformColumns(#"Merge Original to Prev", {{"Min Offset", each List.Min([Offset]), Int64.Type}}),
    #"Added Last Month" = Table.AddColumn(#"Expanded Min Offset", "Retail last month", each if [Min Offset] = 1 then "Yes" else "No", type text),
    #"Added Last 3 Months" = Table.AddColumn(#"Added Last Month", "Retailed since last 3 months", each if [Min Offset] <> null and [Min Offset] <= 3 then "Yes" else "No", type text),
    #"Added Last 12 Months" = Table.AddColumn(#"Added Last 3 Months", "Retailed since 12 months", each if [Min Offset] <> null and [Min Offset] <= 12 then "Yes" else "No", type text)
in
    #"Added Last 12 Months"

I don't have time to fully explain it but the outline is roughly as follows:

  1. Expand each row to 12 rows of prior months.
  2. Match up those prior months with rows from the original table.
  3. Join the original rows with all of the matches found for that row.
  4. Find the most recent match (minimal offset) for each matching set.
  5. Define the 1, 3, and 12-month lookback columns using this offset.

Final Result Screenshot