I think one solution might be to "only keep the first1 annual spend per project". More abstractly, "only keep the first value in column(s) X
per column(s)Y
".
Below is some mock/dummy data. I only want to keep the highlighted values in my annual spend
column (as the highlighted values are the first "annual spend" figures per "project").
This is the M
code I'm using to achieve this. (To try it, open the Query Editor > Advanced Editor (near top right) > copy-paste code below to there > OK
).
let
OnlyKeepFirstValueInColumn = (someTable as table, columnsToNullify as list) as table =>
let
firstRow = Table.FirstN(someTable, 1), // This assumes first row contains a non-blank value.
remainingRows = Table.Skip(someTable, 1),
loopAndNullify = List.Accumulate(columnsToNullify, remainingRows, (tableState, currentHeader) => Table.TransformColumns(tableState, {{currentHeader, each null}})),
combined = firstRow & loopAndNullify
in combined,
FirstValueOfColumnsPerGroup = (someTable as table, groupByColumns as list, columnsToNullify as list) =>
let
group = Table.Group(someTable, groupByColumns, {{"toCombine", each OnlyKeepFirstValueInColumn(_, columnsToNullify), type table}}),
combined = Table.Combine(group[toCombine])
in combined,
aggregatedTable = Table.FromColumns({Text.ToList("aaabbbccccdddeeefg"), List.Repeat({1000}, Text.Length("aaabbbccccdddeeefg"))}, type table [project=text, annual spend=number]),
transformed = FirstValueOfColumnsPerGroup(aggregatedTable, {"project"}, {"annual spend"})
in
transformed
The important bit to understand is this line:
transformed = FirstValueOfColumnsPerGroup(aggregatedTable, {"project"}, {"annual spend"})
in which you should replace:
aggregatedTable
with whatever variable/expression contains your table
{"project"}
with the name of your "project" column (keep the curly braces {}
though as they let you pass in several columns if needed)
{"annual spend"}
with the names of whichever column(s) you want to keep only the first value in (keep the curly braces {}
)
This is what I get (which I think is similar to what you want):
1To keep things simple, we'll say "first" here means the value in the first row. It could have meant "first non-null value" or "first value satisfying some particular condition or logic", but your data suggests the simpler definition will work okay.