0
votes

I have two data sources that I wish to combine.

Source 1

Data source on Sheet1 contains 2 columns: AccountID and Cost. Source is in cells A3:B3 (data row starts in A4). Data continues for 1 to N rows. A secondary cell contains the year value for all records to be used in this source. It is in cell D1.

Source 2

Data source on Sheet2 contains 6 columns and in range A1:F1 (data starts on row 2): Year, AccountID, Location, Phone#, Email type, & Cost. Data continues for 1 to N rows.

I wish to join the records of both tables based on the AccountID of each range of data (Sheet1 column A, Sheet2 column B), where the year (sourced on Sheet1!D1) is added to all records on the Sheet2 data, and records matching by AccountID contain included information when joining data.

Example #1

AccountID "1234" is on Sheet1 but not on Sheet2. Year value (Sheet1!D1) contains 2020. The end result would have a new row added with the AccountID of "1234" into Sheet2, the associated Cost value, and a year of 2020.

Example #2

AccountID "1234" is on sheet1 and also on Sheet2. The record on Sheet2 has a year value of 2019. The record on Sheet1 has the year 2020 associated with it (Sheet1!D1 cell value). The end result would have two rows of data for AccountID of "1234" in Sheet2, one for the 2019 record and one for 2020 record, as found on Sheet1 data source. The latter would not include Location, Phone#, or Email type values, as they aren't specified anywhere.

Images

Sheet 1:

Sheet 1

Sheet 2:

Sheet 2

Sheet2 after data is populated from sheet1:

Sheet 2 after data is populated from sheet 1

1
David, I have a solution, but the question was closed for reasoning I don't understand. I'm too new to answering questions on this forum to know what else to do.Zack Barresse

1 Answers

1
votes

This can be accomplished with Power Query. The following queries assume that you have created Tables with your data, named "Table1" on Sheet1 and "Table2" on Sheet2. This will require 5 separate queries: 2 for the Tables of data, one function, a merge table, and one final query to combine the data. For all of these, add a new blank query (Data tab > Get Data > From Other Sources > Blank Query). You will also need to create a named range for cell D1 on Sheet1 titled "YearValue".

YearValue function

Paste the below code into the Advanced Editor (Power Query > View > Advanced Editor). Change the name (on the right) to "YearValue".

let
    Source = Excel.CurrentWorkbook(){[Name="YearValue"]}[Content]{0}[Column1]
in
    Source

Close & load to connection only.

Table 1 query

Paste the below code into the Advanced Editor (Power Query > View > Advanced Editor). Change the name (on the right) to "Table1".

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    TypeColumns = Table.TransformColumnTypes(Source,{{"AccountID", Int64.Type}, {"Cost", Int64.Type}}),
    AddYearColumn = Table.AddColumn(TypeColumns, "Year", each YearValue)
in
    AddYearColumn

Close & load to connection only.

Table 2 query

Paste the below code into the Advanced Editor (Power Query > View > Advanced Editor). Change the name (on the right) to "Table2".

let
    Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    TypeColumns = Table.TransformColumnTypes(Source,{{"Year", Int64.Type}, {"AccountID", Int64.Type}, {"Location", type text}, {"Phone#", type text}, {"Email type", type text}, {"Cost", Int64.Type}})
in
    TypeColumns

Close & load to connection only.

Merge table query

Paste the below code into the Advanced Editor (Power Query > View > Advanced Editor). Change the name (on the right) to "MergeTable".

let
    Source = Table.FromColumns({List.Distinct(List.Select(Table1[AccountID]&Table2[AccountID],each _ <> null))},type table[AccountID = text]),
    MergeQueries = Table.NestedJoin(Source,{"AccountID"},Table1,{"AccountID"},"Table1",JoinKind.LeftOuter),
    ExpandedTable1 = Table.ExpandTableColumn(MergeQueries, "Table1", {"AccountID", "Cost", "Year"}, {"Table1.AccountID", "Table1.Cost", "Table1.Year"}),
    MergeQueries2 = Table.NestedJoin(ExpandedTable1,{"AccountID"},Table2,{"AccountID"},"Table2",JoinKind.LeftOuter),
    ExpandedTable2 = Table.ExpandTableColumn(MergeQueries2, "Table2", {"Year", "AccountID", "Location", "Phone#", "Email type", "Cost"}, {"Table2.Year", "Table2.AccountID", "Table2.Location", "Table2.Phone#", "Table2.Email type", "Table2.Cost"}),
    RemoveColumns = Table.RemoveColumns(ExpandedTable2,{"AccountID", "Table2.Year", "Table2.AccountID", "Table2.Cost"}),
    ReorderColumns = Table.ReorderColumns(RemoveColumns,{"Table1.Year", "Table1.AccountID", "Table2.Location", "Table2.Phone#", "Table2.Email type", "Table1.Cost"}),
    RenameColumns = Table.RenameColumns(ReorderColumns,{{"Table1.Year", "Year"}, {"Table1.AccountID", "AccountID"}, {"Table2.Location", "Location"}, {"Table2.Phone#", "Phone#"}, {"Table2.Email type", "Email type"}, {"Table1.Cost", "Cost"}})
in
    RenameColumns   

Close & load to connection only.

Combine tables query

Paste the below code into the Advanced Editor (Power Query > View > Advanced Editor). Change the name (on the right) to "CombineTables".

let
    Source = Table.Combine({Table2, MergeTable}),
    FilterRows = Table.SelectRows(Source, each ([Year] <> null)),
    SortRows = Table.Sort(FilterRows,{{"AccountID", Order.Ascending}})
in
    SortRows

Close & load worksheet (as desired).

HTH