2
votes

A table that is similar to the data set I am working on (although much simpler) is below that I would like to calculate some measures on and then find the percentiles of the measures.

Table Name: Data
Owner  AgeRating   OtherRating
  A        1            2
  A        4            4
  A        4            6
  B        3            3
  B        3            9
  B        7            4
  C        8            8
  C        4            2

First - A little background: I start by taking an average of the ratings (By Owner) and then normalize all ratings by dividing each rating by the maximum owner's rating - This creates the measure I would like to take the percentile of:

NormAgeRating=
average(Data[AgeRating])/
calculate(
    maxx(
        SUMMARIZE(Data,[Owner],"avg",average([AgeRating]))
       ,[avg]
    )
   ,all(Data[owner])
)

I have a pivot table with Rows being the owner which then looks like

Owner NormAgeRating
 A        .5
 B       .72
 C        1

Now for the question: I would like to get the .33 percentile.inc of the new NormAgeRating. I would like to use this to classify each owner into groups (<=33%ile or > 33%ile) This is what I am trying to get to:

Owner NormAgeRating   33%ile   classification
 A        .5            .64     bottom
 B       .72            .64     top
 C        1             .64     top

I have tried this with no success and many other variation with different groupby's etc. and continually get the wrong value:

33%ile=percentilex.inc(all(data[owner]),[NormAgeRating],0.33)

Any help would be greatly appreciated

Update: When I try sumx countx and averagex in the form:

=
averagex(
    SUMMARIZE(
        all(Data[Owner]),
        [Owner],
        "risk",[NormAgeRating]),
    [risk]
)

I am getting the right values, so I am not sure why using percentilex.inc/exc would produce the wrong values...

1

1 Answers

0
votes

PERCENTILEX (and all iterator functions) operates row by row on the table in the first argument. Therefore, you need that table to be at the desired granularity before you try to compute the percentile, which means you need to summarize Data[Owner] so that you have a unique row per owner rather than iterating over the raw column.

Keeping this in mind, both measures can be written similarly:

NormAgeRating = 
DIVIDE (
    AVERAGE ( Data[AgeRating] ),
    MAXX (
        SUMMARIZE (
            ALL ( Data[Owner] ),
            Data[Owner],
            "Avg", AVERAGE ( Data[AgeRating] )
        ),
        [Avg]
    )
)

33%ile = 
PERCENTILEX.INC (
    SUMMARIZE (
        ALL ( Data[Owner] ),
        Data[Owner],
        "Risk", [NormAgeRating]
    ),
    [Risk],
    0.33
)