0
votes

I have 2 tables that I'm joining with an Excel Power Query Merge. I have an Hours table and a Parts table. I'm using a Left Outer join on Task_ID to keep all rows from the Hours table and add the matching rows from the Parts table. This is fine and pulls in the Part No. and Cost from the Parts table but because my common field (Task_ID) is not unique I'm getting duplicate rows which is expected but I don't want to duplicate values in the Sum of Hours column (created from an earlier join using an aggregate function).

Is my desired output even possible or is there a better way altogether to go about this?

I want for example to get a Sum of Hours per Week per Task and a Total Cost per Week per Task

Hours

| Week | Task_ID | Sum of Hours |
| ---- |-------- |------------- |
| 01   | ABC#1   | 1            |
| 01   | ABC#2   | 3            |
| 02   | ABC#2   | 3            |

Parts

| Task_ID | Part No. | Cost |
| ------- | -------- | ---- |
| ABC#1   | 111      | 10   |
| ABC#1   | 222      | 20   |
| ABC#2   | 111      | 10   |
| ABC#2   | 333      | 30   |
| ABC#2   | 444      | 40   | 

Desired Output

| Week | Task_ID | Sum of Hours | Part No. | Cost |
| ---- | ------- | ------------ | -------- | ---- |
| 01   | ABC#1   | 1            | 111      | 10   |
| 01   | ABC#1   | null         | 222      | 20   |
| 01   | ABC#2   | 3            | 111      | 10   |
| 01   | ABC#2   | null         | 222      | 20   |
| 01   | ABC#2   | null         | 333      | 30   |
| 02   | ABC#2   | 3            | 111      | 10   |
| 02   | ABC#2   | null         | 222      | 20   |
| 02   | ABC#2   | null         | 333      | 30   |
1

1 Answers

0
votes

Am unsure if the desired output shown in the question is exactly correct or meant as a rough example. For example, it doesn't seem to contain the row where Task_ID is ABC#2 and Part No. is 444, even though ABC#2 appears in the Hours table.

The code below should give you something similar to what you've shown for the Sum Of Hours column in the expected output (i.e. the original value for the first occurrence, then null for any recurrences).

let
    hours = Table.FromColumns({
        {"01", "01", "02"},
        {"ABC#1", "ABC#2", "ABC#2"},
        {1, 3, 3}
    }, type table [Week = text, Task_ID = text, Sum Of Hours = text]),
    parts = Table.FromColumns({
        {"ABC#1", "ABC#1", "ABC#2", "ABC#2", "ABC#2"},
        {111, 222, 111, 333, 444},
        {10, 20, 10, 30, 40}
    }, type table [Task_ID = text, #"Part No." = text, Cost = text]),
    joined = Table.NestedJoin(hours, "Task_ID", parts, "Task_ID", "_joined", JoinKind.LeftOuter),
    nestedIndex = Table.TransformColumns(joined, {"_joined", each Table.AddIndexColumn(_, "_index"), type table}),
    expanded = Table.ExpandTableColumn(nestedIndex, "_joined", {"Part No.", "Cost", "_index"}),
    madeDuplicatesNull = Table.FromRecords(Table.TransformRows(expanded, each 
        Record.RemoveFields(_, {"_index", "Sum Of Hours"}) 
        & [Sum Of Hours = if [_index] > 0  then null else [Sum Of Hours]]
    ))
in
    madeDuplicatesNull

However, if the aim here is just to compute some numbers per week per task, you can probably compute the figure directly (de-duplicating where necessary) without expanding the rows? Something like:

let
    hours = Table.FromColumns({
        {"01", "01", "02"},
        {"ABC#1", "ABC#2", "ABC#2"},
        {1, 3, 3}
    }, type table [Week = text, Task_ID = text, Sum Of Hours = text]),
    parts = Table.FromColumns({
        {"ABC#1", "ABC#1", "ABC#2", "ABC#2", "ABC#2"},
        {111, 222, 111, 333, 444},
        {10, 20, 10, 30, 40}
    }, type table [Task_ID = text, #"Part No." = text, Cost = text]),
    joined = Table.NestedJoin(hours, "Task_ID", parts, "Task_ID", "_joined", JoinKind.LeftOuter),
    grouped = Table.Group(joined, {"Week", "Task_ID"}, {
        {"Sum Of Hours", each List.Single([Sum Of Hours]), type number},
        {"Cost", each
            let
                aggregated = Table.AggregateTableColumn(_, "_joined", {{"Cost", List.Sum, "Cost"}}),
                cost = List.Single(aggregated[Cost])
            in cost
        , type number}
    })
in
    grouped