2
votes

Using Power Query in Microsoft Excel 2013, I created a table that looks like this:

// To insert this in Power Query, append a '=' before the 'Table.FromRows'
Table.FromRows(
  {
    {"0", "Tom", "null", "null"},
    {"1", "Bob", [ name="Berlin" , street="BarStreet" ], [ name="Mary", age=25 ]},
    {"2", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
  },
  {"ID", "Name", "Address", "Wife"}
)

Now, I want to expand the columns Address and Wife by using the name attribute on both records. Manually, I would do it like this:

// To insert this in Power Query, append a '=' before the 'Table.FromRows'
let
  t = Table.FromRows(
    {
      {"0", "Tom", "null", "null"},
      {"1", "Bob", [ name="Berlin" , street="BarStreet" ], [ name="Mary", age=25 ]},
      {"2", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
    },
    {"ID", "Name", "Address", "Wife"}
  ),
  expAddress = Table.ExpandRecordColumn(t, "Address", {"name"}, {"Address → name"}),
  expWife = Table.ExpandRecordColumn(expAddress, "Wife", {"name"}, {"Wife → name"})
in
  expWife

Background

Whenever I have data tables that have a different layout, I need to rewrite the query. In a fantasy world, you could expand all columns that have Records in them using a specific key. Ideally, you would have the following library functions:

// Returns a list with the names of the columns that match the secified type.
// Will also try to infer the type of a column if the table is untyped.
Table.ColumnsOfTypeInfer(
  table as table,
  listOfTypes as list
) as list

// Expands a column of records into columns with each of the values.
Table.ExpandRecordColumnByKey(
  table as table,
  columns as list,
  key as text,
) as table

Then, I could call

// To insert this in Power Query, append a '=' before the 'Table.FromRows'
let
  t = Table.FromRows(
    {
      {"0", "Tom", "null", "null"},
      {"1", "Bob", [ name="Berlin" , street="BarStreet" ], [ name="Mary", age=25 ]},
      {"2", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
    },
    {"ID", "Name", "Address", "Wife"}
  ),
  recordColumns = Table.ColumnsOfTypeInfer(t, {type record}),
  expAll = Table.ExpandRecordColumnByKey(t, recordColumns, "name")
in
  expAll

Question

  1. Can you get a list of columns with a specific type that is not specified in the table, aka infer it?
  2. Can you make that record expansion generic?

Edit: Added row #0 with two null cells.

2

2 Answers

3
votes

(First off, thanks for the clear explanation and sample data and suggestions!)

1) There's no way in M code to do type inference. This limitation might almost be considered a "feature", because if the source data changes in a way that causes the inferred type to be different, it will almost certainly break your query.

Once you load your untyped data, it should be quick to use the Detect Data Type button to generate the M for this. Or if you are reading data from JSON it should be mostly typed enough already.

If you have a specific scenario where this doesn't work want to update your question? :)

2) It's very possible and only a little convoluted to make the record expansion generic, as long as the cell values of the table are records. This finds columns where all rows are either null or a record and expands the name column.

Here's some simple implementations you can add to your library:

let
  t = Table.FromRows(
    {
      {"0", "Tom", null, null},
      {"1", "Bob", [ name="Berlin" , street="BarStreet" ], [ name="Mary", age=25 ]},
      {"2", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
    },
    {"ID", "Name", "Address", "Wife"}
  ),

  Table.ColumnsOfAllRowType = (table as table, typ as type) as list => let
    ColumnNames = Table.ColumnNames(table),
    ColumnsOfType = List.Select(ColumnNames, (name) => 
      List.AllTrue(List.Transform(Table.Column(table, name), (cell) => Type.Is(Value.Type(cell), typ))))
  in
    ColumnsOfType,

  Table.ExpandRecordColumnByKey = (table as table, columns as list, key as text) as table  => 
    List.Accumulate(columns, table, (state, columnToExpand) => 
      Table.ExpandRecordColumn(state, columnToExpand, {key}, { columnToExpand & " → " & key })),

  recordColumns = Table.ColumnsOfAllRowType(t, type nullable record),
  expAll = Table.ExpandRecordColumnByKey(t, recordColumns, "name")
in
  expAll

If a new library function can be implemented in just M we're less likely to add it to our standard library, but if you feel it is missing feel free to suggest it at: https://ideas.powerbi.com/forums/265200-power-bi/

You might have a good argument for adding something like Table.ReplaceTypeFromFirstRow(table as table) as table, because constructing the type with M is very messy.

1
votes

Sorry to come to this a bit late, but I just had a similar challenge. I tried using Chris Webb's ExpandAll function:

http://blog.crossjoin.co.uk/2014/05/21/expanding-all-columns-in-a-table-in-power-query/

... but that only works on Table-type columns, not Record-type columns, but I have managed to hack it to that purpose. I duplicated Chris' function as "ExpandAllRecords" and made 3 edits: :

  1. replaced each if _ is table then Table.ColumnNames(_) with each if _ is record then Record.FieldNames(_)
  2. replaced Table.ExpandTableColumn with Table.ExpandRecordColumn
  3. replaced ExpandAll with ExpandAllRecords

I tried getting both tables and records expanding in one function, but I kept getting type errors.

Anyway, with that in place, the final query is just:

let
    t = Table.FromRows(
        {
            {"1", "Tom", null, [ name="Jane", age=35 ]},
            {"2", "Bob", [ name="Berlin" , street="BarStreet" ], [ name="Mary", age=25 ]},
            {"3", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
        },
        {"ID", "Name", "Address", "Wife"}
    ),
    Output = ExpandAllRecords(t)
in
    Output

Edit:

Out of concern that that one day the great snippet (by Chris Webb, mentioned by @MikeHoney) will one day disappear), I'll mirror the entire code here:

let
     //Define function taking two parameters - a table and an optional column number 
     Source = (TableToExpand as table, optional ColumnNumber as number) =>
     let
      //If the column number is missing, make it 0
      ActualColumnNumber = if (ColumnNumber=null) then 0 else ColumnNumber,

      //Find the column name relating to the column number
      ColumnName = Table.ColumnNames(TableToExpand){ActualColumnNumber},

      //Get a list containing all of the values in the column
      ColumnContents = Table.Column(TableToExpand, ColumnName),

      //Iterate over each value in the column and then
      //If the value is of type table get a list of all of the columns in the table
      //Then get a distinct list of all of these column names
      ColumnsToExpand = List.Distinct(List.Combine(List.Transform(ColumnContents, 
                         each if _ is table then Table.ColumnNames(_) else {}))),

      //Append the original column name to the front of each of these column names
      NewColumnNames = List.Transform(ColumnsToExpand, each ColumnName & "." & _),

      //Is there anything to expand in this column?
      CanExpandCurrentColumn = List.Count(ColumnsToExpand)>0,

      //If this column can be expanded, then expand it
      ExpandedTable = if CanExpandCurrentColumn 
                          then 
                          Table.ExpandTableColumn(TableToExpand, ColumnName, 
                                 ColumnsToExpand, NewColumnNames) 
                          else 
                          TableToExpand,
      //If the column has been expanded then keep the column number the same, otherwise add one to it
      NextColumnNumber = if CanExpandCurrentColumn then ActualColumnNumber else ActualColumnNumber+1,

      //If the column number is now greater than the number of columns in the table
      //Then return the table as it is
      //Else call the ExpandAll function recursively with the expanded table
      OutputTable = if NextColumnNumber>(Table.ColumnCount(ExpandedTable)-1) 
                         then 
                         ExpandedTable 
                         else 
                         ExpandAll(ExpandedTable, NextColumnNumber)
     in
     OutputTable
in
     Source

You can then use this function on the XML file as follows:

let
     //Load XML file
     Source = Xml.Tables(File.Contents("C:\Users\Chris\Documents\PQ XML Expand All Demo.xml")),
     ChangedType = Table.TransformColumnTypes(Source,{{"companyname", type text}}),

     //Call the ExpandAll function to expand all columns
     Output = ExpandAll(ChangedType)
in
     Output  

(Source and downloadable example: Chris Webb's Bi Blog, 2014-05-21)