2
votes

Question

What is the most elegant & performant way to define DAX formulas calculating the values of a confusion matrix (TP, FP, FN, TN) so that other measures (e.g. Precision, Recall, F1) can make use of them?

Background

We are serving a classification model to users. Users have expressed a desire to see model evaluation metrics for a given subset of historical predictions.

My approach of one measure for each of the four values strikes me as inefficient... Originally, I had a "What-If?" parameter for the threshold, so the "predicted" column was a measure and calculated columns/tables were ruled out. I'm thinking there's a potential use case for SUMMARIZE()...

Current approach

True Positive  := CALCULATE ( DISTINCTCOUNT ( Eval[id] ), Eval[actual] = 1, Eval[predicted] = 1 )
False Positive := CALCULATE ( DISTINCTCOUNT ( Eval[id] ), Eval[actual] = 0, Eval[predicted] = 1 )
False Negative := CALCULATE ( DISTINCTCOUNT ( Eval[id] ), Eval[actual] = 1, Eval[predicted] = 0 )
True Negative  := CALCULATE ( DISTINCTCOUNT ( Eval[id] ), Eval[actual] = 0, Eval[predicted] = 0 )
Precision      := DIVIDE([True Positive],[True Positive]+[False Positive])
Recall         := DIVIDE([True Positive],[True Positive]+[False Negative])
F1 Score       := 2*DIVIDE([Precision]*[Recall],[Precision]+[Recall])

Example source data

| id | predicted | actual |
|----|-----------|--------|
| 1  | 0         | 0      |
| 2  | 0         | 0      |
| 3  | 0         | 1      |
| 4  | 1         | 1      |
| 5  | 1         | 1      |
| 6  | 1         | 0      |
| 7  | 1         | 0      |

Source query

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTIA41idaCUjFJ4xlGcI5pmAWTCeKQrPDMqD6DNH8GIB", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [id = _t, predicted = _t, actual = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"id", Int64.Type}, {"predicted", Int64.Type}, {"actual", Int64.Type}})
in
    #"Changed Type"
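For reference, the expected quadrant counts and metrics for the sample data above can be checked outside DAX. A minimal Python sketch (not part of the Power BI solution, just a cross-check of the arithmetic):

```python
# Sample rows from the question: (id, predicted, actual)
rows = [(1, 0, 0), (2, 0, 0), (3, 0, 1), (4, 1, 1), (5, 1, 1), (6, 1, 0), (7, 1, 0)]

# Tally each confusion-matrix quadrant
tp = sum(1 for _, p, a in rows if p == 1 and a == 1)
fp = sum(1 for _, p, a in rows if p == 1 and a == 0)
fn = sum(1 for _, p, a in rows if p == 0 and a == 1)
tn = sum(1 for _, p, a in rows if p == 0 and a == 0)

precision = tp / (tp + fp)                          # 2 / 4 = 0.5
recall    = tp / (tp + fn)                          # 2 / 3
f1 = 2 * precision * recall / (precision + recall)  # 4 / 7

print(tp, fp, fn, tn)   # 2 2 1 2
print(round(f1, 4))     # 0.5714
```

Any DAX formulation should reproduce these numbers when the whole table is in the filter context.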
This is a really open question. I will try to give more feedback: (1) do you need to show the confusion matrix for a specific threshold? (2) do you need to plot the ROC? – Seymour
@Seymour (1) The confusion matrix is not really needed... just F1, I think (though ultimately it depends on what users want). (2) I've plotted the ROC already and 0.5 is best, so there's no need for the ROC in the report. – Anders Swanson
OK, now it is clear: you were missing the verb in the Question sentence. – Seymour

2 Answers

2
votes

In general, the most elegant and performant way is not a single line of DAX; rather, it is the overall approach you use to design your data model.

The general rule for optimization is: push as much work as possible to the back end.

Having said this, you will always face a trade-off between query performance and the total memory of your data model. If total memory is not a problem, the best practice recommended by Microsoft is to sum individual columns.

Here is my approach:
1. Modify the M query to add four new columns that we will use as fields to aggregate. In general, prefer modifying the data source over creating calculated columns. The resulting data source should be the following:

+----+-----------+--------+-------+-------+-------+-------+
| id | predicted | actual | TP_fl | TN_fl | FP_fl | FN_fl |
+----+-----------+--------+-------+-------+-------+-------+
|  1 |         0 |      0 |     0 |     1 |     0 |     0 |
|  2 |         0 |      0 |     0 |     1 |     0 |     0 |
|  3 |         0 |      1 |     0 |     0 |     0 |     1 |
|  4 |         1 |      1 |     1 |     0 |     0 |     0 |
|  5 |         1 |      1 |     1 |     0 |     0 |     0 |
|  6 |         1 |      0 |     0 |     0 |     1 |     0 |
|  7 |         1 |      0 |     0 |     0 |     1 |     0 |
+----+-----------+--------+-------+-------+-------+-------+
  2. Next, let's define the base measures:

[TP] := SUM ( 'Eval'[TP_fl] )
[TN] := SUM ( 'Eval'[TN_fl] )
[FP] := SUM ( 'Eval'[FP_fl] )
[FN] := SUM ( 'Eval'[FN_fl] )

Please note that the following measures are based on the assumption that Eval[id] is the primary key of the Eval table, as in your example.

  3. Finally, let's define the complex measures, which are calculated from the base measures above:

[Precision] :=
VAR _TP = [TP]
RETURN DIVIDE ( _TP, _TP + [FP] )

[Recall] :=
VAR _TP = [TP]
RETURN DIVIDE ( _TP, _TP + [FN] )

[F1 Score] :=
VAR _TP = [TP]
VAR _FP = [FP]
VAR _FN = [FN]
VAR _PRECISION = DIVIDE ( _TP, _TP + _FP )
VAR _RECALL = DIVIDE ( _TP, _TP + _FN )
RETURN 2 * DIVIDE ( _PRECISION * _RECALL, _PRECISION + _RECALL )

Two important remarks:

  • In complex measures, using variables is key to avoid wasting computation time.
  • This optimized answer goes against the software-engineering principle of avoiding redundant code. As explained, there is always a trade-off; in this scenario, the mathematical formulas won't change.
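The flag-column pattern described above can be illustrated outside DAX as well. A minimal Python sketch (hypothetical, just to show the shape of the approach: indicator columns are precomputed at the source, so every base measure reduces to a plain sum):

```python
# Sample rows from the question: (id, predicted, actual)
rows = [(1, 0, 0), (2, 0, 0), (3, 0, 1), (4, 1, 1), (5, 1, 1), (6, 1, 0), (7, 1, 0)]

# Back-end step: derive one 0/1 flag per quadrant, mirroring TP_fl..FN_fl
flagged = [
    {
        "id": i,
        "TP_fl": int(p == 1 and a == 1),
        "TN_fl": int(p == 0 and a == 0),
        "FP_fl": int(p == 1 and a == 0),
        "FN_fl": int(p == 0 and a == 1),
    }
    for i, p, a in rows
]

# Measure step: each base "measure" is now a cheap SUM over one column
tp = sum(r["TP_fl"] for r in flagged)
tn = sum(r["TN_fl"] for r in flagged)
fp = sum(r["FP_fl"] for r in flagged)
fn = sum(r["FN_fl"] for r in flagged)

print(tp, tn, fp, fn)  # 2 2 2 1
```

The work of classifying each row happens once, at load time, rather than on every query, which is the point of pushing it to the back end.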
1
vote

DISTINCTCOUNT is a poorly performing operation. There is a chance to gain performance by swapping it for an iterator such as SUMX, for example:

True Positive  = SUMX ( Eval, IF ( Eval[predicted] && Eval[actual], 1 ) )
False Positive = SUMX ( Eval, IF ( Eval[predicted] && NOT Eval[actual], 1 ) )
True Negative  = SUMX ( Eval, IF ( NOT Eval[predicted] && NOT Eval[actual], 1 ) )
False Negative = SUMX ( Eval, IF ( NOT Eval[predicted] && Eval[actual], 1 ) )

Edit: If you don't need the individual measure definitions for the numbers in each confusion-matrix quadrant, you can calculate the F1 score directly. Using variables helps both readability and performance.

F1 Score =
VAR TruePositive = SUMX ( Eval, IF ( Eval[predicted] && Eval[actual], 1 ) )
VAR Selected = SUMX ( Eval, IF ( Eval[predicted], 1 ) )
VAR Relevant = SUMX ( Eval, IF ( Eval[actual], 1 ) )
VAR Precision = DIVIDE ( TruePositive, Selected )
VAR Recall = DIVIDE ( TruePositive, Relevant )
RETURN 2 * DIVIDE ( Precision * Recall, Precision + Recall )
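As a sanity check on the identity this measure relies on (precision = TP / predicted-positive count, recall = TP / actual-positive count), a small Python sketch against the question's sample data (not part of the DAX solution):

```python
# Sample rows from the question: (id, predicted, actual)
rows = [(1, 0, 0), (2, 0, 0), (3, 0, 1), (4, 1, 1), (5, 1, 1), (6, 1, 0), (7, 1, 0)]

true_positive = sum(1 for _, p, a in rows if p and a)
selected = sum(1 for _, p, _a in rows if p)   # predicted positives (TP + FP)
relevant = sum(1 for _, _p, a in rows if a)   # actual positives (TP + FN)

precision = true_positive / selected
recall    = true_positive / relevant
f1 = 2 * precision * recall / (precision + recall)

print(round(f1, 4))  # 0.5714
```

Since Selected = TP + FP and Relevant = TP + FN, this computes the same F1 as the four-quadrant formulation, with one fewer pass over the table.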