1
votes

I need to create a dimension table from the current "Data Set" in order to be used in Excel and Power Bi. As for the moment the information has been saved in an excel workbook.

I've simplified a workbook to set it up as an example. Here's the Link to Download it: https://1drv.ms/x/s!AkyEWh5MlySFoWOFMTC3GDXY09eS

Currently, everything has been structured as a Pivot Table, where the column headers are dates, like so:

+----+----------+----------+----------+
| ID | 12/29/14 | 01/05/15 | 01/12/15 |
+----+----------+----------+----------+
| 00 | Darell   | Darell   | Darell   |
| 01 | Jarrod   | Annelle  | Todd     |
| 04 | Lakeesha | Linn     | Linn     |
+----+----------+----------+----------+

I've un-pivot the table and grouping the ID and name field plus calculating the minimum and maximum date value. Works like charm.

Problem comes with the last date, since this will provide me the max value for the local grouping, and in reality the last day would be one day before the next column on the original data, like so:

Here is an example:

+----+----------+----------+----------+
| ID |   Name   |   From   |    To    |
+----+----------+----------+----------+
| 00 | Darell   | 12/29/14 | 01/05/15 |
| 00 | Kamala   | 01/12/15 | ...      |
| 01 | Jarrod   | 12/29/14 | 12/29/14 |
| 01 | Annelle  | 01/05/15 | 01/05/15 |
| 01 | Todd     | 01/12/15 | ....     |
| 04 | Lakeesha | 12/29/14 | 12/29/14 |
| 04 | Linn     | 01/05/15 | ....     |
| .. | ........ | ....     | ....     |
+----+----------+----------+----------+

This is a simplified code that I've created to this:

let
Source = Excel.CurrentWorkbook(){[Name="RawData"]}[Content],

#"Unpivot [Table]" = Table.UnpivotOtherColumns(Source, {"ID"}, "Date", "Name"),

#"Group [Table]" = Table.Group(#"Unpivot [Table]", {"ID", "Name"}, {{"From", each List.Min([Date]), type date}, {"On", each List.Max([Date]), type date}}, GroupKind.Local)

in
#"Group [Table]"

Here is what I'm expecting to get as a result, each name grouped locally with a start and end date.

+----+----------+----------+----------+
| ID |   Name   |   From   |    To    |
+----+----------+----------+----------+
| 00 | Darell   | 12/29/14 | 01/11/15 |
| 00 | Kamala   | 01/12/15 | ....     |
| 01 | Jarrod   | 12/29/14 | 01/04/15 |
| 01 | Annelle  | 01/05/15 | 01/11/15 |
| 01 | Todd     | 01/12/15 | ....     |
| 04 | Lakeesha | 12/29/14 | 01/04/15 |
| 04 | Linn     | 01/05/15 | ....     |
| .. | ........ | ....     | ....     |
+----+----------+----------+----------+

with this I will generate each date in between "From" and "To" with

Table.AddColumn(#"Calculate [To]", "Date", each { Number.From([From])..Number.From([To]) })
1
What about using function Date.AddDays? docs.microsoft.com/en-us/powerquery-m/date-adddays You can substract days by adding negative number of days - Sergey Lossev

1 Answers

0
votes

Start from where you at in your sample worksheet,

  1. Add an Index Column start from 0 with 1 per increment;
  2. Duplicate the query, and change the Index Column step to start from 1 with 1 per increment;
  3. Merge the second query with the first query. In the Merge window, holding down Ctrl key on your keyboard and select ID and Index column consecutively in both queries as shown below (please note in my screenshot RowData (3) is the duplicated query while RowData (2) is the original query):

Merge

  1. Expand the merged column to show From date and Use original column name as prefix;
  2. Add a Custom Column with the following formula: =if [#"RawData.From"]=null then [On] else Date.AddDays([#"RawData.From"],-1)
  3. Remove irrelevant columns as desired and sort the table (optional step).

You can then add another custom column to find the days between From and To per ID per Name.


EDIT #2

You can actually add two Index columns to the original query instead of making a duplicated query. Just make sure to sort the query by ID by From and then add the Index columns one start from 0 and the other start from 1, then merge the query with itself by selecting the correct columns as shown below and the rest of the steps are the same.

Demo 2.0

Here are the M codes behind the scene for your reference.

let
    Source = Excel.CurrentWorkbook(){[Name="RawData"]}[Content],
    #"Filter [null]" = Table.SelectRows(Source, each [ID] <> null and [ID] <> ""),
    #"Remove [Duplicates]" = Table.Distinct(#"Filter [null]", {"ID"}),
    #"Unpivot [Table]" = Table.UnpivotOtherColumns(#"Remove [Duplicates]", {"ID"}, "Date", "TM"),
    #"Detect Data Type" = Table.TransformColumnTypes(#"Unpivot [Table]",{{"ID", Int64.Type}, {"Date", type date}, {"TM", type text}}),
    #"Group [Table]" = Table.Group(#"Detect Data Type", {"ID", "TM"}, {{"From", each List.Min([Date]), type date}, {"On", each List.Max([Date]), type date}}, GroupKind.Local),
    #"Sorted Rows" = Table.Sort(#"Group [Table]",{{"ID", Order.Ascending}, {"From", Order.Ascending}}),
    #"Added Index" = Table.AddIndexColumn(#"Sorted Rows", "Index", 0, 1),
    #"Added Index1" = Table.AddIndexColumn(#"Added Index", "Index.1", 1, 1),
    #"Merged Queries" = Table.NestedJoin(#"Added Index1", {"ID", "Index.1"}, #"Added Index1", {"ID", "Index"}, "Added Index1", JoinKind.LeftOuter),
    #"Expanded Added Index1" = Table.ExpandTableColumn(#"Merged Queries", "Added Index1", {"From"}, {"Added Index1.From"}),
    #"Added Custom" = Table.AddColumn(#"Expanded Added Index1", "To", each if [Added Index1.From]=null then [On] else Date.AddDays([Added Index1.From],-1)),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"On","Index", "Index.1", "Added Index1.From"})
in
    #"Removed Columns"

Let me know if you have any questions. Cheers :)