1
votes

I am trying to combine worksheets from two different workbooks with Power Query and I have trouble doing that.

I do not want to merge the two workbooks.

I do not want to create relationships or "joints".

However, I want to get very specific information for one workbook which has only one column. The "ID" column.

The ID column has rows with letter tags : AB or BE. Following these letters, sepcific numeric ranges are associated. For both AB and BE, number ranges first from 0000 to 3000 and from 3000 to 6000.

I thus have the following possibilities: From AB0000 to AB3000 From AB3001 to AB6000 From BE0000 to BE3000 From BE3001 to AB6000

Each category match to the a specific item in my column geography, from the other workbook: From AB0000 to AB3000, it is ItalyZ From AB3001 to AB6000, it is ItalyB From BE0000 to BE3000, it is UKY From BE3001 to AB6000, it is UKM

I am thus trying to find the highest number associated to the first AB category, the second AB category, the first BE category, and the second.

I then want to "bring" this number in the other query and increment it each time that matching country is found in the other workbook. For example :

AB356 is the highest number in the first workbook.

Once the first "ItalyB" is found, the column besides writes "AB357". Once the second is "ItalyB" is found, the column besides write "AB358".

Here is the one columned worksheet: ![input

Here is the other worksheet with the various countries in geography: input

Here is an example of results: result

have one column (geography) with I think that this is something which I should work towards:

I added the index column, with a start as one, because each row (even row zero) should increment either of the four matching code. enter image description here

In order to keep moving forward I have also been trying to create some sort of mapping in third excel sheet, that I imported in Power BI, but I am not sure that this is a good way forward:

mapping

I have the following result when I create a blank query: enter image description here

After a correction, I still get this result when creating the blank query: enter image description here

1
Good job, I gave you 10 points on the question because I know how much work you did. We have a bit of a problem here because we need to have a kind of order on the rows for our DAX. If you have 4 times ItalyZ, how do you know which one is firts? Without this we cannot get the result you want. Is there a date column or an ID? If not we will need to make in index in m-guery.Aldert
Thank you ! I hope to have enough mastery at some point to gie points and answer questions. I am loosing my mind with Power BI and my workbook. To answer the question: the order is not strongly important. I could have 4 times ItalyZ, or even 100 occurrences of ItalyZ. Whichever "appears" first in my "geogrpaphy" column should be incremented, going top down (first row), but the DAX or Power Query could as well "start" at very end of my Geography column and start increment from that point. So in, in order to "create" a "fake" order I should create a Index column after "geography" ?Tomas Michel

1 Answers

1
votes

This is not an easy answer as there are many steps to get to your result. I have choosen for m-query because of the complexity.

In PBi click on Transform data, now you are in m-query.

  1. The table with the ID's (I called it "HighestID") needs expansion because we need to be able to map on prefix
  2. You need a mapping table ("GeoMapping"), else there is no relation between the Prefixes and the geolocation.
  3. We need the newID on the Geo-table (which I called "Geo").

Expand the HighestID table.

Click on the table and open the Advanced Editor, look at your code and compare it to the one below, the last 2 steps are essential, there I add two columns (Prefix and Number) which we need later.

let
    Source = Csv.Document(File.Contents("...\HighestID.csv"),[Delimiter=";", Columns=1, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
    #"Changed Type1" = Table.TransformColumnTypes(#"Promoted Headers",{{"ID", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type1", "Prefix", each Text.Middle([ID],0,2), type text),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Number", each Number.FromText(Text.Middle([ID],2,5)))
in
    #"Added Custom1"

Result: enter image description here

Create mapping table

Click right button under your last table and click Blank Query: enter image description here

Paste the source below, ensure the name of the merg table equals the name of your table. As I mentioned, I called it HighestID.

let
    Source = #table({"Prefix", "Seq_Start", "Seq_End","GeoLocation"},{{"AB",0,2999,"ItalyZ"},{"AB",3000,6000,"ItalyB"},{"BC",0,299,"UKY"},{"BC",3000,6000,"UKM"}}),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Seq_Start", Int64.Type}, {"Seq_End", Int64.Type}}),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type", {"Prefix"}, HighestID, {"Prefix"}, "HighestID", JoinKind.LeftOuter),
    #"Expanded HighestID" = Table.ExpandTableColumn(#"Merged Queries", "HighestID", {"Number"}, {"Number"}),
    #"Filtered Rows" = Table.SelectRows(#"Expanded HighestID", each [Number] >= [Seq_Start] and [Number] <= [Seq_End]),
    #"Grouped Rows" = Table.Group(#"Filtered Rows", {"Prefix", "Seq_Start", "Seq_End", "GeoLocation"}, {{"NextSeq", each List.Max([Number]) + 1, type number}})
in
    #"Grouped Rows"

Result: enter image description here

Adding the NextSeq Column

This is the hard bit because when I would only give you teh code, I am afraid it will not work so I give you the steps you need to do.

1.Select the table, right click on Geography and click Group by. select as below: enter image description here

  1. Merge with table Geomapping as below: enter image description here

  2. Expand the GeoMapping with NextSeq enter image description here

  3. Add a custom column: enter image description here

  4. Remove columns not needed so only custom is left created in step 4.

  5. Expand the column (all select). End result all your columns you had earlier plus an Index column.

enter image description here