0
votes

I've recently run into an issue which hopefully is solvable.

Currently, I have power query pointing at a folder containing several CSV files. This is normally no issue, however, in this instance not all of the files have the same columns.

Is there a way to have power query return every unique column found in the folder populating empty data observations with null values?

Assume that my folder has csv files similar to the following (note that the rows are indexed using letters for easy reference):

enter image description here

I would like my final table to look something like:

enter image description here

This seems like it should be pretty simple, but I can't figure it out for the life of me! Any help would be greatly appreciated!

2

2 Answers

0
votes

Assuming you're using Folder.Files, I think you can:

  1. Grab the Content column of the table returned by Folder.Files -- which should give you a list of binary values.
  2. Parse each item in the list as a CSV document using List.Transform and Csv.Document -- which should give you a list of tables.
  3. Then merge your list of tables with Table.Combine -- which should give you one single table. Table.Combine should take care of the details (like aligning column names).

You've not provided any code in your question, so it's hard to give a relevant example, but I think the code below gives me your expected output.

I've turned the row indexes into an ID column, just to make the final table easier to verify/follow.

let
    firstCsv = 
"ID,one,two,three
A,1,4,7
B,2,5,8
C,3,6,9",
    secondCsv = 
"ID,one,two,three,four
D,1,6,11,16
E,2,7,12,17
F,3,8,13,18
G,4,9,14,19
H,5,10,15,20",
    thirdCsv =
"ID,one,two,yes,no,maybe
I,1,1,1,1,1
J,2,2,2,2,2
K,3,3,3,3,3
L,4,4,4,4,4
M,5,5,5,5,5",


    // For example's sake, let's suppose that the contrived table below was 
    // returned by calling Folder.Files
    filesInFolder = Table.FromColumns({
        List.Transform({firstCsv, secondCsv, thirdCsv}, Text.ToBinary),
        List.Transform({"1".."3"}, each "CSV file " & _ & ".csv"),
        List.Repeat({"someFolderPath"}, 3)
    }, type table [Content = binary, Name = text, Folder = text]),
    parsed = List.Transform(filesInFolder[Content], each 
        let
            csv = Csv.Document(_, [Delimiter = ",", QuoteStyle = QuoteStyle.Csv]),
            promoted = Table.PromoteHeaders(csv, [PromoteAllScalars = true])
        in promoted
    ),
    // The step below should match the expected output in your question.
    combined = Table.Combine(parsed)
in
    combined

Obviously, you'll need to adjust for your own folder path and actually call Folder.Files as you presumably already are in your own code.

0
votes

I've always used something like this

//read all files in specified directory you fill in here
let Source = Folder.Files("C:\directory\subdirectory"),
//filter only csv files
#"Filtered Rows" = Table.SelectRows(Source, each ([Extension] = ".csv")),
//Pull contents of each file into table with an index
#"Added Custom1" = Table.AddColumn(#"Filtered Rows", "Custom", each Table.AddIndexColumn(Csv.Document(File.Contents([Folder Path]&"\"&[Name]),[Delimiter=",", Encoding=1252,     QuoteStyle=QuoteStyle.None]),"Index")),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom1", "Custom", {"Column1", "Index"}, {"Column1", "Index"}),
#"Removed Other Columns" = Table.SelectColumns(#"Expanded Custom",{"Column1", "Index", "Name"}),
#"Pivoted Column" = Table.Pivot(#"Removed Other Columns", List.Distinct(#"Removed Other Columns"[Name]), "Name", "Column1"),
#"Removed Columns" = Table.RemoveColumns(#"Pivoted Column",{"Index"})
in #"Removed Columns"