0
votes

I am trying to calculate a median on timeseries monitoring data.

The OLAP cube has a date dimension (lowest level is date) and a seperate time dimension. (lowest level is minute)

I've found that it calculates correctly for a given day, but it rolls up the values for the month. (it should be ~1.55 for the month)

WITH MEMBER [Measures].[Medianx] AS MEDIAN([Time].[Time Key].MEMBERS , [Measures].[Total of Values])      

SELECT NON EMPTY {
[Measures].[Count of Values], 
[Measures].[Minimum Value], 
[Measures].[Maximum Value], 
[Measures].[Total of Values], 
[Measures].[Medianx]
} ON COLUMNS,  

NON EMPTY ( [Date].[Date].[Date].MEMBERS )  ON ROWS 

FROM [Environmental Data]

Gives this:

enter image description here

The Median for the Month can't be higher than median of the 5 individual days.

How do I get OLAP to calculate the Median correctly across the time range selected when rolling up? (e.g by Day and Month and Year)

Could it be caused by having seperate date and time dimensions?

Edit: I've found that if I just a degereate dimension wihc has a 1:1 relationship with the fact. It calcalates correctly for the month, but is really slow. (2min30sec for 1429 rows of data)

WITH MEMBER [Measures].[Medianx] AS MEDIAN([Data Attributes].[DETL Key].MEMBERS, [Measures].[Total of Values])

However, if I try to query grouping by day it never finishes the query.

1

1 Answers

0
votes

I found I could use the DESCENDANTS function to drill down through both the Date & Time dimensions to get to an average aggregate with a granularity of 1 minute per day.

MEDIAN(NonEmptyCrossJoin(DESCENDANTS([Date].[Calendar Date].CurrentMember, , LEAVES), DESCENDANTS([Time].[Time Key].CurrentMember, , LEAVES)), [Measures].[Average Value])

Testing found than given the nature of the data an average per minute gave a reliable median. Because it's calculated on aggregates rather than raw data it runs very quickly.

I don't really understand why, but it now calculates correctly for the day, month and year.

Warning: If I select 5 days of the month i will get the correct median for each day, and I will get the correct median for the entire month. (Rather than the median of the 5 days selected).