0
votes

I'm working with the American Community Survey in Power BI and want to replace values in every column with corresponding values in another table of response labels. For example, race is encoded as 1,2,3,4,5 but I want to replace it with Asian, Black, Native, etc according to the response labels provided.

Let's say I have the following table with three variables and peoples' responses:

variable1 variable2 variable3
1 2 3
2 3 2
3 1 1
3 3 2
2 2 3

I am provided with this table of response keys:

VarName ResponseKey ResponseLabel
variable1 1 blue
variable1 2 red
variable1 3 green
variable2 1 left
variable2 2 right
variable2 3 down
variable3 1 high
variable3 2 medium
variable3 3 low

What I want is those three variables with the peoples' responses as the 'ResponseLabel' like so:

variable1 variable2 variable3
blue right low
red down medium
green left high
green down medium
red right low

Usually I would go one by one and replace each variable by hand but I would rather have a root canal than do that for hundred plus variables with anywhere from 2-100 responses so I imagine there is a better way to do this.

So far, I've thought about making tables for each variable and merge with the original table but that sounds like a lot as well. Then I thought maybe I should write a function to iterate across the original table and recode each column one by one. I'm also thinking there might be a way to do this in M but I'm not sure.

Do you have any ideas? Thanks!

1
Would not it be easier to make a joins instead of replacing?Michal Palko
Do you mean join as in merge the tables in PQ or join as in create a relationship between the tables? thx edit: To clarify, there are hundred+ columns and each column can have a hundred difference response options.verstehenmonk
I think you could just unpivot those hundred+ columns into single one and perform join only once.Michal Palko
This is the way. Unpivot both tables, merge with left outer join, expand, done.horseyride
I tried the joins but I don't think I unpivoted my tables correctly. original table Here is the labels Here is the merge of the two And here are the results expanded, which are disordered and I'm not sure how to get this back into a useable format. Where did I go wrong?verstehenmonk

1 Answers

0
votes

Assume your Response Keys table is Table1 and loaded into Powerquery

For the top table, load into PowerQuery

Add column ... index column ...

Right click new index column and ... Unpivot other columns ...

Home .. Merge Queries ...

Set bottom table to Table1, and click to match Attribute with VarName and Value with ResponseKey, with left outer join

enter image description here

Click the arrows atop the new column and [x] expand ResponseLabel

Right click value column and Remove

Click select attribute column, then Transform .. pivot columns ... value = ResponseLabel and Advanced=dont aggregate

enter image description here

Right click index column and Remove

Done

Full code:

let Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
#"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Added Index", {"Index"}, "Attribute", "Value"),
#"Merged Queries" = Table.NestedJoin(#"Unpivoted Other Columns",{"Attribute", "Value"},Table1,{"VarName", "ResponseKey"},"Table1",JoinKind.LeftOuter),
#"Expanded Table1" = Table.ExpandTableColumn(#"Merged Queries", "Table1", {"ResponseLabel"}, {"ResponseLabel"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Table1",{"Value"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Attribute]), "Attribute", "ResponseLabel"),
#"Removed Columns1" = Table.RemoveColumns(#"Pivoted Column",{"Index"})
in #"Removed Columns1"