1
votes

What would the equivalent DAX be for this SQL?

SELECT AVG(CountRows) FROM pbi.FactVend AS FV JOIN pbi.DimAsset AS DA ON DA.KEY_Asset = FV.KEY_Asset WHERE CAST(FV.KEY_VendDate AS Date) NOT BETWEEN DA.ExcludedFromDate AND DA.ExcludedToDate

I have two tables, Vend and Asset. I want to exclude the rows from Vend where the VendDate is in an excluded period. Something like below however I can't get the DAX right. If i filter Vend it cannot see the Asset columns also doesn't seem to like being supplied convert date, KEY_VendDate is Int YYYYMMDD...

Average Cup Vends = 
CALCULATE(AVERAGE(Vend[CountRows]), 
FILTER(Vend, NOT(DATESBETWEEN(CONVERT(Vend[KEY_VendDate], DATETIME), Asset[Excluded From Date], Asset[Excluded To Date])))

enter image description here

2
Can someone help me with this please.jhowe

2 Answers

3
votes

Things I don't understand but I assume:

  1. Assets[ProductKey] has unique values.

enter image description here

  1. Vends (which from now on I'll call Transactions) contains one column related to Assets[ProductKey]
  2. Transactions[Date] is a Date/Time column

enter image description here

  1. Transactions is related to a Dates Table, which is not related to Assets.

I don't know how you're trying to use the measurement, but I hope the following example can help you find the right path.

I excluded Assets[ProductKey] only on January, as you can see in the assets table image, ie:

  • Assets[ProductKey]=21 was excluded from 1/21/2021 00:00 to 1/22/2021 00:00,
  • Assets[ProductKey]=22 was excluded from 1/22/2021 00:00 to
    1/23/2021 00:00 and so on

You can access the columns in the expanded Transactions table through RELATED.

FILTER(Transactions,
NOT(AND(
    Transactions[TransactionDate]>=RELATED(Assets[ExcludedFromDate]),
    Transactions[TransactionDate]<RELATED(Assets[ExcludedToDate]))))

In my example. I used this:

AVGNonExcludedTransactions := 
VAR SMZDateContext=SUMMARIZE(CalendarDateTime,CalendarDateTime[Year],CalendarDateTime[MonthName])
VAR NonExcludedTransactions= 
FILTER(Transactions,
NOT(AND(
    Transactions[TransactionDate]>=RELATED(Assets[ExcludedFromDate]),
    Transactions[TransactionDate]<RELATED(Assets[ExcludedToDate]))))
VAR Result=
ADDCOLUMNS(SMZDateContext, "Count", CALCULATE(COUNTROWS(INTERSECT(
Transactions, NonExcludedTransactions))))
RETURN
AVERAGEX(Result,[Count])

enter image description here

... removing the highlighted rows. One day of exclusion for each [ProductKey] in the Assets table and more than one day of exclusion for each product in the Transactions table.

enter image description here

which can be analyzed by changing INTERSECT() to EXCEPT() and increasing granularity at the day level.

enter image description here

EDIT:

In this second part the objective is not to use FILTER on the Transactions table. However, I think the following approach can be improved by changing dates to numbers. And I still don't know if it's more efficient than using FILTER on a 10M row table. Probably not, because it would be necessary to have less than 100 products and more than 2M transactions

This is what the model looks like:

enter image description here

This time TCountR is a simpler measure:

TCountR = COUNTROWS(Transactions)

And the filter is calculated in another way. With a single DateTime column containing the exclusion period for each product within the granularity of CalendarDateTime:

AVGTCountRNonExcluded := 
VAR TotalRow = 
SUMMARIZE(CalendarDateTime,CalendarDateTime[Year],CalendarDateTime[MonthName])
VAR AllCJ =
CROSSJOIN(SUMMARIZE(Products,Products[ProductKey]),SUMMARIZE(CalendarDateTime,CalendarDateTime[DateTime]))
VAR Excluded=
SELECTCOLUMNS(
    GENERATE(Assets,
                    ADDCOLUMNS(
                                CROSSJOIN (
                        //Dates in Transactions should be rounded down at the hour level.
                        // -1 means that the day 1/2/2021 is not included
                        //From 1/1/2021 00:00 to 1/1/2021 23:00
                        ////
                        //Without adding or subtracting a value:
                        //From 1/1/2021 00:00 to 1/2/2021 23:00
                                            CALENDAR(Assets[ExcludedFromDate],Assets[ExcludedToDate]-1),
                                            SELECTCOLUMNS(GENERATESERIES(0,23,1),"Time",TIME([Value],0,0))),
                                "DateTime", [Date] + [Time])),
"ProductKey", Assets[ProductKey], "DateTime", [DateTime])
VAR FilteredOut=EXCEPT(AllCJ,Excluded)
VAR Result = ADDCOLUMNS(TotalRow,"Count", CALCULATE([TCountR],KEEPFILTERS(FilteredOut)))
RETURN
AVERAGEX(Result,[Count])

The result is the same.

enter image description here

EDIT 2

Why not?

If you already understood the 2nd approach, you may wonder, what if I can add a column to my Transactions table and change the [TransactionDate] format from DateTime to Date, and use a Dates Table only at the Date level. Example:

1/1/2021 23:00 To 1/1/2021 00:00
1/2/2021 00:00 To 1/2/2021 00:00

The code gets simpler:

AVGCountRowsDateLevel := 
VAR TotalRow= SUMMARIZE(Dates,Dates[Year],Dates[MonthName])
VAR AllCJ=CROSSJOIN(SUMMARIZE(Products,Products[ProductKey]),SUMMARIZE(Dates,Dates[Date]))
VAR Excluded=
SELECTCOLUMNS(
    GENERATE(Assets,
    DATESBETWEEN(Dates[Date],Assets[ExcludedFromDate],Assets[ExcludedToDate]-1)),
"ProductKey", Assets[ProductKey], "Date", [Date])
VAR FilteredOut=EXCEPT(AllCJ,Excluded)
VAR Result = ADDCOLUMNS(TotalRow,"Count", CALCULATE([TCountR],KEEPFILTERS(FilteredOut)))
RETURN
AVERAGEX(Result,[Count])

And the result is the same

As I said at the beginning, this is an example, which I hope can help you find the solution.

2
votes

Assuming you have a relationship from Assets[KEY_Asset] to Vend[KEY_Asset] and Vend[VendDate] is formatted as a date, then you can write

Average Cup Vends =
CALCULATE (
    AVERAGE ( Vend[CountRows] ),
    FILTER (
        Vend,
        NOT AND (
            Vend[VendDate] > RELATED ( Asset[Excluded From Date] ),
            Vend[VendDate] < RELATED ( Asset[Excluded To Date] )
        )
    )
)

This requires first defining a calculated column Vend[VendDate] to convert Vend[KEY_VendDate] from YYYYMMDD to a date format. You can define such a column as follows:

VendDate =
DATE (
    LEFT  ( Vend[KEY_VendDate], 4 ),
    MID   ( Vend[KEY_VendDate], 5, 2 ),
    RIGHT ( Vend[KEY_VendDate], 2 )
)

Another option is to convert the Asset date columns into integer format instead.

Average Cup Vends =
CALCULATE (
    AVERAGE ( Vend[Countrows] ),
    FILTER (
        Vend,
        NOT AND (
            Vend[KEY_VendDate]
              > VALUE ( FORMAT ( RELATED ( Asset[Excluded From Date] ), "yyyymmdd" ) ),
            Vend[KEY_VendDate]
              < VALUE ( FORMAT ( RELATED ( Asset[Excluded To Date] ), "yyyymmdd" ) )
        )
    )
)