0
votes

I have a column in my table that has some text values (input) which I would like to convert to numbers (output) for each unique text value, so that I can do some regression analysis:

Input Output
AOP 1
AOS 2
AOS 2
AOS 2
AOP 1
null 0 or null
AOP 1

I initially tried to do this do this with several Transform: Replace Values steps, however I don't know how to:

  • make this flexible to different numbers of unique values (not hardcode 3 replacements but handle n where n is the number of unique values in input)
  • repeat this for many columns of my table
  • avoid looping as far as possible

What's a better approach?

1

1 Answers

1
votes

One way is add custom column with below formula, and do that for each column you care to apply it to, using the value of each text character to generate a unique number

= try 
 List.Accumulate(Text.ToList([Input]), "", (state, current)=>
 state&Number.ToText(Character.ToNumber(current), "0000")) otherwise null

this would transform all column's text into unique numbers, replacing the original data:

let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
Function = (x) => try  List.Accumulate(Text.ToList(x), "", (state, current)=> state&Number.ToText(Character.ToNumber(current), "0000")) otherwise null,
TransformList = List.Transform(Table.ColumnNames(Source), each {_ , Function}),
Output = Table.TransformColumns(Source, TransformList)
in Output

this would transform all column's text into unique numbers, appending the new columns to existing columns:

let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
Function = (x) => try List.Accumulate(Text.ToList(x), "", (state, current)=> state&Number.ToText(Character.ToNumber(current), "0000")) otherwise null,
TransformList = List.Transform(Table.ColumnNames(Source), each {_ , Function}),
Output = Table.TransformColumns(Source, TransformList),
Numericals=Table.RenameColumns( Output, List.Zip( { Table.ColumnNames( Output), List.Transform(Table.ColumnNames(Output), each _ &"number") } ) ),
#"Merged Queries" = Table.NestedJoin(Table.AddIndexColumn(Source, "Index", 0, 1),{"Index"},Table.AddIndexColumn(Numericals, "Index2", 0, 1),{"Index2"},"Tabl2",JoinKind.LeftOuter),
#"Expanded Tabl2" = Table.ExpandTableColumn(#"Merged Queries", "Tabl2", Table.ColumnNames( Numericals),Table.ColumnNames( Numericals)),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Tabl2",{"Index"})
in #"Removed Columns"