1
votes

How is it possible to do an dynamic histrogram using MDX ?

For example, our schema is based on web visits, we've the number the sessions and the number of click-outs. We would like to have the number of session with one click-out taking into account that this might depend on other dimensions (country, hour, entry-page...).

1
Please can you add some more detail to this question. Also a starting mdx script. Maybe also a mock-up of what you're aiming for.whytheq
see answer below, it's a question recurring we're getting. So I'm answering here to share itic3

1 Answers

1
votes

To solve this we are going to work with two different concepts. First create a new Hierarchy and afterwards use MDX+.

First we've to create a new dimension, [Histrogram]. This new dimension will contain the defintion of the buckets with two member properties : start-bucket and end-bucket. A pseudo table that looks like

Name start-bucket end-bucket
0-1       0           1
1-2       1           2
2-3       2           3
...
10++     10   2147483647

This Hierarchy is not linked to the facts and defines for each member two properties defining a bucket.

Let's put this to use in MDX.

Let's assume we've a dimension, [Sessions], and a measure, [click-outs]. First we're going to use the OO features of icCube and create a vector that for each session calculates the number of [click-outs]

-> Vector( [Sessions], [click-outs], EXCLUDEEMPTY )

Vector has a function, hist(start,end), that does exactly what we need and is counting all occurencies between start and end (excluded).

Vector( [Sessions], [click-outs], EXCLUDEEMPTY )->hist(0,1)

Putting this together with our newly created hierarchy allows to automize the calculation for all buckets. The const function ensures the vector is calculated only once as it might be time consuming.

The final MDX looks like (note that both function and calc. members could be created in the schema script, once per schema):

WITH
 CONST FUNCTION ClicksBySession() AS Vector( [Sessions], [Measures].[click-outs], EXCLUDEEMPTY )
 MEMBER [Session/Clickout] AS  ClicksBySession()->hist( [Histogram].currentMember.properties("start-bucket", TYPED) , [Histogram].currentMember.properties("end-bucket", TYPED) 
SELECT
   {[Session/Clickout] } on 0,
  [Histogram].on 1
FROM [clickout]
--where  [Geography].[Europe]

And there you've an histrogram that is calculated dynamically that can be easily inserted in a dashboard and reused.