1
votes

Hello I am trying to a solve a simple problem but i haven't been able to find a solution on Google or anywhere. I have multiple Excel files with tables inside them that are identical. I started a PowerQuery to load all the Tables objects to my data model and I need now is a way to merge all the tables together based on a common PrimaryKey.

All the tables in the files are identical in structure and have the same number of rows but different columns. There is only 1 PrimaryID column and it is called the same in all tables.

I was able to figure it out doing all the merging between the files manually 1 table at a time but I was wondering if Power Query is able to automate this process by joining all tables located in a folder in a similar manner. (i.e. merging Table 1 ---LEFT JOIN---> Table 2 --LEFT JOIN--> Table 3)

1
Are there the same number of columns in each file?Alexis Olson

1 Answers

0
votes

Let's say I have a couple of files in one folder. I can load from that folder as my data connection and it should look like this:

Load from folder

Now I'll add a custom column that reads the binary file in each row and returns a table. My files are formatted as CSV, so I'd write this:

= Table.AddColumn(Source, "Custom", each Table.PromoteHeaders(Csv.Document([Content])))

I'll remove all the other columns except for this new one and then use Table.Combine to turn that column into a combined table:

= Table.Combine(#"Removed Other Columns"[Custom])

Combine tables

Here the first four rows are from FileA and the last four are from FileBV.

This isn't quite what I want, but I can remove those null values by unpivoting the non-ID columns and then re-pivot to get back to the shape I want. (There's probably a better way I'm not thinking of.)

Unpivotted:

Unpivotted

Pivotted:

Pivotted


M Code:

let
    Source = Folder.Files("C:\Users\Computer\test"),
    BinToTable = Table.AddColumn(Source, "Custom", each Table.PromoteHeaders(Csv.Document([Content]))),
    #"Removed Other Columns" = Table.SelectColumns(BinToTable,{"Custom"}),
    #"Added Custom" = Table.Combine(#"Removed Other Columns"[Custom]),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Added Custom", {"ID"}, "Attribute", "Value"),
    #"Pivoted Column" = Table.Pivot(#"Unpivoted Columns", List.Distinct(#"Unpivoted Columns"[Attribute]), "Attribute", "Value")
in
    #"Pivoted Column"

It's not super elegant, but it should be fairly robust since we never use the names or number of the columns in the underlying files except for ID.