2
votes

Question

What is the most elegant & performant way to define DAX formulas calculating the values of a confusion matrix (TP, FP, FN, TN) so that other measures (e.g. Precision, Recall, F1) can make use of them?

Background

We are serving a classification model to users. Users have expressed a desire to see model evaluation metrics for a given subset of historical predictions.

My approach of one measure for each of the four values strikes me as inefficient... Originally, I had a "What-If?" parameter for the threshold, so the "predicted" column was a measure and calculated columns/tables were ruled out. I'm thinking there's a potential use case for SUMMARIZE()...

Current approach

True Positive  := CALCULATE ( DISTINCTCOUNT ( Eval[id] ), Eval[actual] = 1, Eval[predicted] = 1 )
False Positive := CALCULATE ( DISTINCTCOUNT ( Eval[id] ), Eval[actual] = 0, Eval[predicted] = 1 )
False Negative := CALCULATE ( DISTINCTCOUNT ( Eval[id] ), Eval[actual] = 1, Eval[predicted] = 0 )
True Negative  := CALCULATE ( DISTINCTCOUNT ( Eval[id] ), Eval[actual] = 0, Eval[predicted] = 0 )
Precision      := DIVIDE([True Positive],[True Positive]+[False Positive])
Recall         := DIVIDE([True Positive],[True Positive]+[False Negative])
F1 Score       := 2*DIVIDE([Precision]*[Recall],[Precision]+[Recall])

Example source data

| id | predicted | actual |
|----|-----------|--------|
| 1  | 0         | 0      |
| 2  | 0         | 0      |
| 3  | 0         | 1      |
| 4  | 1         | 1      |
| 5  | 1         | 1      |
| 6  | 1         | 0      |
| 7  | 1         | 0      |

Source query

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTIA41idaCUjFJ4xlGcI5pmAWTCeKQrPDMqD6DNH8GIB", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [id = _t, predicted = _t, actual = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"id", Int64.Type}, {"predicted", Int64.Type}, {"actual", Int64.Type}})
in
    #"Changed Type"
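For reference, the expected quadrant counts and metrics for the sample data above can be checked outside DAX. A minimal Python sketch (not part of the Power BI solution, just a cross-check of the arithmetic):

```python
# Sample rows from the question: (id, predicted, actual)
rows = [(1, 0, 0), (2, 0, 0), (3, 0, 1), (4, 1, 1), (5, 1, 1), (6, 1, 0), (7, 1, 0)]

# Tally each confusion-matrix quadrant
tp = sum(1 for _, p, a in rows if p == 1 and a == 1)
fp = sum(1 for _, p, a in rows if p == 1 and a == 0)
fn = sum(1 for _, p, a in rows if p == 0 and a == 1)
tn = sum(1 for _, p, a in rows if p == 0 and a == 0)

precision = tp / (tp + fp)                          # 2 / 4 = 0.5
recall    = tp / (tp + fn)                          # 2 / 3
f1 = 2 * precision * recall / (precision + recall)  # 4 / 7

print(tp, fp, fn, tn)   # 2 2 1 2
print(round(f1, 4))     # 0.5714
```

Any DAX formulation should reproduce these numbers when the whole table is in the filter context.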
This is a really open question. I will try to give more feedback: (1) do you need to show the confusion matrix for a specific threshold? (2) do you need to plot the ROC? – Seymour
@Seymour (1) The confusion matrix is not really needed... just F1, I think (though ultimately it depends on what users want). (2) I've plotted the ROC already and 0.5 is best, so there's no need for the ROC in the report. – Anders Swanson
OK, now it is clear: you were missing the verb in the Question sentence. – Seymour

2 Answers

2
votes

In general, the most elegant and performant way is not a single line of DAX; rather, it is the overall approach you use to design your data model.

The general rule for optimization is: push as much work as possible to the back end.

Having said this, you will always face a trade-off between query performance and the total memory of your data model. If total memory is not a problem, the best practice recommended by Microsoft is to sum individual columns.

Here is my approach:
1. Modify the M query to add four new columns that we will use as fields to aggregate. In general, prefer modifying the data source over creating calculated columns. The resulting data source should be the following:

+----+-----------+--------+-------+-------+-------+-------+
| id | predicted | actual | TP_fl | TN_fl | FP_fl | FN_fl |
+----+-----------+--------+-------+-------+-------+-------+
|  1 |         0 |      0 |     0 |     1 |     0 |     0 |
|  2 |         0 |      0 |     0 |     1 |     0 |     0 |
|  3 |         0 |      1 |     0 |     0 |     0 |     1 |
|  4 |         1 |      1 |     1 |     0 |     0 |     0 |
|  5 |         1 |      1 |     1 |     0 |     0 |     0 |
|  6 |         1 |      0 |     0 |     0 |     1 |     0 |
|  7 |         1 |      0 |     0 |     0 |     1 |     0 |
+----+-----------+--------+-------+-------+-------+-------+
  2. Next, let's define the base measures:

[TP] := SUM ( 'Eval'[TP_fl] )
[TN] := SUM ( 'Eval'[TN_fl] )
[FP] := SUM ( 'Eval'[FP_fl] )
[FN] := SUM ( 'Eval'[FN_fl] )

Please note that the following measures are based on the assumption that Eval[id] is the primary key of the Eval table, as in your example.

  3. Finally, let's define the complex measures, which are calculated from the base measures above:

[Precision] :=
VAR _TP = [TP]
RETURN DIVIDE ( _TP, _TP + [FP] )

[Recall] :=
VAR _TP = [TP]
RETURN DIVIDE ( _TP, _TP + [FN] )

[F1 Score] :=
VAR _TP = [TP]
VAR _FP = [FP]
VAR _FN = [FN]
VAR _PRECISION = DIVIDE ( _TP, _TP + _FP )
VAR _RECALL = DIVIDE ( _TP, _TP + _FN )
RETURN 2 * DIVIDE ( _PRECISION * _RECALL, _PRECISION + _RECALL )

Two important remarks:

  • In complex measures, using variables is key to avoid wasting computation time.
  • This optimized answer goes against the software-engineering principle of avoiding redundant code. As explained, there is always a trade-off; in this scenario, the mathematical formulas won't change.
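The flag-column pattern described above can be illustrated outside DAX as well. A minimal Python sketch (hypothetical, just to show the shape of the approach: indicator columns are precomputed at the source, so every base measure reduces to a plain sum):

```python
# Sample rows from the question: (id, predicted, actual)
rows = [(1, 0, 0), (2, 0, 0), (3, 0, 1), (4, 1, 1), (5, 1, 1), (6, 1, 0), (7, 1, 0)]

# Back-end step: derive one 0/1 flag per quadrant, mirroring TP_fl..FN_fl
flagged = [
    {
        "id": i,
        "TP_fl": int(p == 1 and a == 1),
        "TN_fl": int(p == 0 and a == 0),
        "FP_fl": int(p == 1 and a == 0),
        "FN_fl": int(p == 0 and a == 1),
    }
    for i, p, a in rows
]

# Measure step: each base "measure" is now a cheap SUM over one column
tp = sum(r["TP_fl"] for r in flagged)
tn = sum(r["TN_fl"] for r in flagged)
fp = sum(r["FP_fl"] for r in flagged)
fn = sum(r["FN_fl"] for r in flagged)

print(tp, tn, fp, fn)  # 2 2 2 1
```

The work of classifying each row happens once, at load time, rather than on every query, which is the point of pushing it to the back end.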
1
vote

DISTINCTCOUNT is a poorly performing operation. There is a chance to gain performance by swapping it for an iterator such as SUMX, for example:

True Positive  = SUMX ( Eval, IF ( Eval[predicted] && Eval[actual], 1 ) )
False Positive = SUMX ( Eval, IF ( Eval[predicted] && NOT Eval[actual], 1 ) )
True Negative  = SUMX ( Eval, IF ( NOT Eval[predicted] && NOT Eval[actual], 1 ) )
False Negative = SUMX ( Eval, IF ( NOT Eval[predicted] && Eval[actual], 1 ) )

Edit: If you don't need the individual measure definitions for the numbers in each confusion-matrix quadrant, you can calculate the F1 score directly. Using variables helps both readability and performance.

F1 Score =
VAR TruePositive = SUMX ( Eval, IF ( Eval[predicted] && Eval[actual], 1 ) )
VAR Selected = SUMX ( Eval, IF ( Eval[predicted], 1 ) )
VAR Relevant = SUMX ( Eval, IF ( Eval[actual], 1 ) )
VAR Precision = DIVIDE ( TruePositive, Selected )
VAR Recall = DIVIDE ( TruePositive, Relevant )
RETURN 2 * DIVIDE ( Precision * Recall, Precision + Recall )
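As a sanity check on the identity this measure relies on (precision = TP / predicted-positive count, recall = TP / actual-positive count), a small Python sketch against the question's sample data (not part of the DAX solution):

```python
# Sample rows from the question: (id, predicted, actual)
rows = [(1, 0, 0), (2, 0, 0), (3, 0, 1), (4, 1, 1), (5, 1, 1), (6, 1, 0), (7, 1, 0)]

true_positive = sum(1 for _, p, a in rows if p and a)
selected = sum(1 for _, p, _a in rows if p)   # predicted positives (TP + FP)
relevant = sum(1 for _, _p, a in rows if a)   # actual positives (TP + FN)

precision = true_positive / selected
recall    = true_positive / relevant
f1 = 2 * precision * recall / (precision + recall)

print(round(f1, 4))  # 0.5714
```

Since Selected = TP + FP and Relevant = TP + FN, this computes the same F1 as the four-quadrant formulation, with one fewer pass over the table.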