1
votes

I'm trying to merge 2 queries in Power BI Desktop, matching rows based off a user and date column in one query to a row in the other query, where the user matches and the date in the 2nd query is the closest one before the date in the 1st query.

In other scenarios I need to match on more than one column, I'll usually create a composite key to match, but here's it's not a direct match.

Examples of the 2 queries are:

QUERY1

User     Activity     Activity Date
User 1   Activity 1   2019-01-24
User 1   Activity 2   2019-03-03
User 1   Activity 3   2019-04-17

QUERY2

User     Status     Status Change Date
User 1   Status 1   2019-02-05
User 1   Status 2   2019-03-06
User 1   Status 3   2019-04-05

And the merged query I'm looking for is:

MERGED QUERY

User     Activity     Activity Date   Status
User 1   Activity 1   2019-01-24       
User 1   Activity 2   2019-03-03      Status 1
User 1   Activity 3   2019-04-17      Status 3

Both queries are sourced from a REST API. If it was a SQL source, I'd use a SQL query to create a derived island table of start and stop dates based on Query2 and do a BETWEEN join against Query1 and have that be the source for Power BI.

Within the Power Query Editor, how would I get to the merged query result?

1

1 Answers

0
votes

First, you want to do as you suggested and modify the status table to have start and stop dates instead of Status Change Date. You can do this by sorting, indexing, and self-merging as I've previously explained here and here.

Once you have that, you can load a copy of the status table in each row and use the User and Date columns to filter the table and finally return a single value for Status.

let
    Source = <Query1 Source>
    #"Added Custom" =
    Table.AddColumn(Source, "Status",
        (C) => List.First(
                   Table.SelectRows(Status, 
                       each [User] = C[User] and 
                            [Start] < C[Date] and
                            ([Stop] = null or C[Date] <= [Stop])
                   )[Status]
               ),
        type text)
in
    #"Added Custom"

This says we take the Status table and filter it so that based on the current row the User matches and the Date is between Start and Stop. From that filtered table, we select the Status column, which is a list data type, so we pick the first element of the list to get the text value of the only member of the list.