I am trying to write multiple DAX expressions to summarize a transaction dataset. The combination of [Manufacturer] and [PartNo] will determine uniqueness so those fields need to be grouped together. I can also concatenate the 2 fields together and create a composite key ([ManufacturerPartKey]) if its easier grouping data that way.
MyTable sample data:
Manufacturer PartNo TxDate TxStatus ErrorCode ManufacturerPartKey
ABC 123 9/29/2020 Fail CodeInvalid ABC-123
ABC 123 9/30/2020 Fail CodeInvalid ABC-123
ABC 123 10/1/2020 Success null ABC-123
ABC 789 10/1/2020 Fail PartInvalid ABC-789
ABC 567 10/1/2020 Success null ABC-567
XYZ 567 9/29/2020 Fail LoadFail XYZ-567
XYZ 567 9/30/2020 Fail LoadFail XYZ-567
XYZ 789 10/1/2020 Fail APIFault XYZ-789
LMO 456 9/29/2020 Fail APIFault LMO-456
LMO 456 9/30/2020 Fail APIFault LMO-456
EFG 123 10/1/2020 Success null EFG-123
QRS 123 10/1/2020 Fail PartInvalid QRS-123
QRS 123 10/2/2020 Fail PartInvalid QRS-123
QRS 123 10/3/2020 Fail PartInvalid QRS-123
QRS 123 10/4/2020 Fail PartInvalid QRS-123
QRS 567 10/4/2020 Success null QRS-567
Requirements:
Count of distinct combinations of Manufacturer + Part No
Expected Output = 9
Count of Success
Expected Output = 4
Count of "outstanding" failures by [Manufacturer] and [PartNo] or [ManufacturerPartKey] and max TxDate
Expected Output = 5
Answer:
CountFailure =
VAR _maxDate =
TREATAS (
ADDCOLUMNS (
VALUES ( MyTable[ManufacturerPartKey] ),
"maxDate", CALCULATE ( MAX ( 'MyTable'[TxDate] ) )
),
MyTable[GroupKey],
MyTable[TxDate]
)
RETURN
CALCULATE (
COUNTROWS ( SUMMARIZECOLUMNS ( MyTable[ManufacturerPartKey], MyTable[TxStatus], _maxDate ) ),
MyTable[TxStatus] = "Failure"
)
Count of reprocessed (ABC-123 was submitted 3 times and was re-processed on the 3rd attempt)
1
Count of distinct error codes by max TxDate
ErrorCode Count CodeInvalid 0 APIFault 2 LoadFail 1 PartInvalid 2 Null 4
Expected Summary Table Output:
Manufacturer TotalSubmissions TotalFailed TotalSucceeded DistinctPartNoCount OutstandingFailures ReprocessedCount
ABC 5 3 2 3 1 1
XYZ 3 3 0 2 2 0
LMO 2 2 0 1 2 0
EFG 1 0 1 1 0 0
QRS 5 4 1 2 0 0
EDIT
New Sample with duplicates:
Manufacturer PartNo TxDate TxStatus ErrorCode ManufacturerPartKey
ABC 123 9/29/2020 Fail CodeInvalid ABC-123
ABC 123 9/30/2020 Fail CodeInvalid ABC-123
ABC 123 10/1/2020 Success null ABC-123
ABC 123 10/1/2020 Fail DuplicatePart ABC-123
ABC 789 10/1/2020 Fail PartInvalid ABC-789
ABC 567 10/1/2020 Success null ABC-567
ABC 567 10/2/2020 Fail null ABC-567
XYZ 567 9/29/2020 Fail LoadFail XYZ-567
XYZ 567 9/30/2020 Success null XYZ-567
XYZ 567 9/30/2020 Fail DuplicatePart XYZ-567
XYZ 789 10/1/2020 Fail APIFault XYZ-789
XYZ 789 10/1/2020 Fail APIFault XYZ-789
LMO 456 9/29/2020 Fail APIFault LMO-456
LMO 456 9/30/2020 Fail APIFault LMO-456
LMO 555 10/1/2020 Fail APIFault LMO-555
EFG 123 10/1/2020 Success null EFG-123
QRS 123 10/1/2020 Fail PartInvalid QRS-123
QRS 123 10/2/2020 Fail PartInvalid QRS-123
QRS 123 10/3/2020 Success null QRS-123
QRS 123 10/3/2020 Fail PartInvalid QRS-123
QRS 123 10/4/2020 Fail PartInvalid QRS-123
QRS 567 10/4/2020 Success null QRS-567
QRS 567 10/5/2020 Fail DuplicatePart QRS-567
QRS 222 10/5/2020 Fail APIFault QRS-222
QRS 222 10/6/2020 Success null QRS-222
QRS 222 10/6/2020 Success null QRS-222
New Summary Table Output:
Manufacturer TotalSubmissions TotalFailed TotalSucceeded DistinctPartNoCount OutstandingFailures ReprocessedCount
ABC 6 4 2 3 1 1
XYZ 5 4 1 2 1 1
LMO 3 3 0 2 2 0
EFG 1 0 1 1 0 0
QRS 10 6 4 3 1 2
Note:
If there is at least 1 failure and 1 success per ManufacturerPartKey, it should be Reprocessed = 1. The count should not increment even if there are duplicates. For QRS-567, there is 1 success and 1 failure but it shouldn't count as Reprocessed = 1 because the 1st time it loaded it was already successful. QRS-123 and QRS-222 have duplicates but are the two "reprocessed" PartNos for QRS.
If there is at least 1 failure and 0 success, it should be Outstanding Failure = 1. The Outstanding Failures count should not match total failures.
I understand these requirements are challenging given the imperfect dataset but that's what I have to work with. Go me.