1
votes

For the following problem I found already a solution that works in PowerBI (special thanks to Marco Vos). Now, I realize that it would be more convenient for my purpose if this calculation would be already done by PowerQuery.

What I want to do:

A production line writes a time stamp into a CSV file when it starts an event (e.g. machine starts, machine stopps etc.). The CSV file is processed by PowerQuery in a way that all start dates are available in one column of a table. The column is called "Event Start".

To further analyze the events of the machine, I would like to add another column "Event End" that contains also the end date of each event. The end date is not provided by the CSV file but it is equal to the start date of the next event.

So what I need to do in PoweryQuery is to create a custom column which takes in each row the value of "start date" from the following row and place it into the current row.

What I have tried:

I tried to use the same principle as given by Marco Vos in my original thread: Sumif equivalent in PowerBI doesn't work properly with DAX function

BUT PowerQuery tells me that it doesn't know expressions like CALCULATE.

Can anybody support in this topic?

Edited:

To get closer to the solution and to show you the problem I currently face, I tried the following code with a very simple table: The table just contains a sequence number from 1 to 10.000 in column A ("ID"). In order to create column B ("ID-1") that should include the "sequence number - 1", I used the following code based on Ollys comment.

Please note: I cannot upload the table because every upload is blocked by our IT department.

let
    Source = Excel.Workbook(File.Contents("C:\Users\machine2321\Desktop\ID_sample.xlsx"), null, true),
    Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(Sheet1_Sheet, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"ID", Int64.Type}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
    #"Added ID-1" = Table.AddColumn(#"Added Index", "ID-1", each try #"Added Index"[ID]{[Index]+1} otherwise null),
    #"Removed Index" = Table.RemoveColumns(#"Added ID-1",{"Index"})
in
    #"Removed Index"

The problem that I recognize is:

Processing this simple table with PowerQuery in Excel takes already a few minutes. In the original query I process 2000+ files, each containing up to 1.000 events. By using the approach described above the original query runs for hours, aborts and returns the message "Unexpected error" at the end.

Any clue why the code requires such a big amount of processing performance?

1
Power Query uses different language called "M". It's not the same as DAX used by Power BI.RADO

1 Answers

2
votes

You can add an index column, which allows you to make reference to a value from a specific row, based on a relative index value:

#"Added Index" = Table.AddIndexColumn(#"Previous Step", "Index", 0, 1),
#"Added Event End" = Table.AddColumn(#"Added Index", "Event End", each try #"Added Index"[Event Start]{[Index]+1} otherwise null, type datetime),
#"Removed Index" = Table.RemoveColumns(#"Added Event End",{"Index"})

try ... otherwise is included so that an error is not returned for the last row.

EDIT:

As you've discovered, this technique is SLOW for large tables.

There's a faster technique, which joins up a 'shifted' table with the original. The full explanation is long, and covered in detail (with code) here:

https://www.thebiccountant.com/2018/07/12/fast-and-easy-way-to-reference-previous-or-next-rows-in-power-query-or-power-bi/