1
votes

I have a flattened crosstab in Power Query that contains duplicate rows, with values for column 1 & 2 reversed. I'd like to remove the duplicates within Power Query so that I'm left with unique rows only.

enter image description here

I can achieve this within Excel using COUNTIF, but I was hoping to find a Power Query oriented solution. So far I've tried creating two new columns in Power Query (newCol1 = Col1+2, newCol2 = Col2+1), and matching them. However this only gives me the rows where Col1=Col2.

enter image description here

I feel like this should be straightforward but I'm just not getting it despite finding similar posts here and on other places online. Any help is greatly appreciated!

1
Seems to me your fixing a symptom and not the problem. Does the source data truly contain both entries? or were both entries caused by a processing problem in the data extract? I've seen this happen often when a = is used on a table join in SQL when a > or < should be used.xQbert
The source data is a very large crosstab. I was looking for a good way of flattening a crosstab that was imported to Excel from another program, and using Power Query: "From Table-->Unpivot Other Columns" was suggested.duncan.idahopotato

1 Answers

0
votes

I think @xQbert has a good point. You probably want to try to fix this upstream.

However, if you do want to fix it at the point, then I suggest the following:

Create a custom column that puts these in alphabetical order. You can use a formula like this:

if [Col 1] < [Col 2] then [Col 1] & [Col 2] else [Col 2] & [Col 1]

Select the newly created column (only) and click Remove Rows > Remove Duplicates.

Delete the new column if desired.