0
votes

I have been playing around quite a bit with powerquery to import all kinds of files and transform them in usable tables in excel. But recently I stumbled upon a problem I don't seem to find a solution for.

After some steps in my query I finally have the following result :

<table border="1">
 <tr><th>Column 1</th><th>Column 2</th></tr>
 <tr><td>Country</td><td>France</td></tr>
 <tr><td>Number of hits</td><td>10</td></tr>
 <tr><td>Number of misses</td><td>5</td></tr>
 <tr><td>Country</td><td>UK</td></tr>
 <tr><td>Number of hits</td><td>150</td></tr>
 <tr><td>Number of misses</td><td>35</td></tr>
 <tr><td>Country</td><td>Japan</td></tr>
 <tr><td>Number of hits</td><td>5</td></tr>
 <tr><td>Country</td><td>China</td></tr>
 <tr><td>Number of hits</td><td>50</td></tr>
 <tr><td>Number of misses</td><td>3</td></tr>
 </table>

So, normally I always expected each time a line with "Country", followed by a line with "hits" followed by a line with "misses" (and lots of other lines in between that i don't need and filtered out in previous steps).

But it now turns out that for countries where the number of misses = 0 there is not even a row available in the input file. In my example this is the line for "Japan"

My idea was to find some kind of function that does this :

IF < Column1 of current line > ='Number of hits' AND < Column1 of next line > ='Country' THEN insert ['Number of misses','0'] after the current line.

The end result should then look like this :

<table border="1">
 <tr><th>Column 1</th><th>Column 2</th></tr>
 <tr><td>Country</td><td>France</td></tr>
 <tr><td>Number of hits</td><td>10</td></tr>
 <tr><td>Number of misses</td><td>5</td></tr>
 <tr><td>Country</td><td>UK</td></tr>
 <tr><td>Number of hits</td><td>150</td></tr>
 <tr><td>Number of misses</td><td>35</td></tr>
 <tr><td>Country</td><td>Japan</td></tr>
 <tr><td>Number of hits</td><td>5</td></tr>
 <tr><td><b>Number of misses</b></td><td><b>0</b></td></tr>
  <tr><td>Country</td><td>China</td></tr>
 <tr><td>Number of hits</td><td>50</td></tr>
 <tr><td>Number of misses</td><td>3</td></tr>
 </table>

All of this should happen in the (advanced) query part and not once the data is imported in a table in excel.

1

1 Answers

0
votes

See of this works for you. Assumes source data is a table named Table1 You can paste into powerquery in home .. advanced editor...

What it does (a) creates a table of unique descriptions from Column 1 (b) creates a table of unique Country names from Column 2 (c) creates all combinations of the two (d) merges original data back into that, and anything missing automatically becomes a null we can replace with a zero

let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],

// create table of all unique descriptions
#"Removed Duplicates" = Table.Distinct(Source, {"Column 1"}),
#"Select" = Table.FromList(Table.Column(#"Removed Duplicates" ,"Column 1")),
#"Added Custom1" = Table.AddColumn(#"Select", "Custom", each 1),

// create table of all unique countries
#"Filtered Rows" = Table.SelectRows(Source, each ([Column 1] = "Country")),
#"Select2" = Table.FromList(Table.Column(#"Filtered Rows" ,"Column 2")),
#"Removed Duplicates1" = Table.Distinct(Select2),
#"Added Custom" = Table.AddColumn(#"Removed Duplicates1", "Custom", each 1),

// create all possible combinations of description x countries
#"Merged Queries" = Table.NestedJoin(#"Added Custom",{"Custom"},#"Added Custom1" ,{"Custom"},"Table3",JoinKind.FullOuter),
#"Expanded Table3" = Table.ExpandTableColumn(#"Merged Queries", "Table3", {"Column1"}, {"Column1.1"}),

// create table to compare to
#"Added Custom2" = Table.AddColumn(Source, "Fill", each if [Column 1]="Country" then [Column 2] else null),
#"Filled Down" = Table.FillDown(#"Added Custom2",{"Fill"}),

// merge compare table onto combination table. Anything missing comes up as null. Then replace nulls with 0
#"Merged Queries1" = Table.NestedJoin(#"Expanded Table3",{"Column1.1", "Column1"},#"Filled Down",{"Column 1", "Fill"},"Table4",JoinKind.LeftOuter),
#"Expanded Table4" = Table.ExpandTableColumn(#"Merged Queries1", "Table4", {"Column 2"}, {"Column 2"}),
#"Replaced Value" = Table.ReplaceValue(#"Expanded Table4",null,0,Replacer.ReplaceValue,{"Column 2"}),
#"Removed Columns" = Table.RemoveColumns(#"Replaced Value",{"Column1", "Custom"})

in  #"Removed Columns"