0
votes

Recently started implementing multidimensional model in SSAS. OLTP has a table which stores the multiple measures based on Attribute and Entity columns. Attribute column has various dimension names. If in a row the Attribute value is Dim1 then Entity column will have Dim1's Value. Similarly, Attribute column can have any dimension name and its value will be in the Entity column. We also have some tables where we have multiple Attributes and entity columns. e.g. Attribute1 and Attribute2. Both are dimension names and entity1 and entity2 are storing their respective values. Measures also depends on the order of the dimensions. To be specific, tables are storing the Value at the Risk(VAR) values calculated at different compression. VAR calculated at fund and Sector compression are different from the VAR calculated at the Sector and fund compression. OLTP stores it as attribute1 (Fund),Attribute2(Sector) and Entity1 (Fund's value) and Entity2(Sector's value). Querying it is also easy with where caluse [where attribute1 = 'Fund' and attribute2 ='Sector'] or [where attribute1 = 'Sector' and attribute2 ='Fund']
How this can be effectively modeled in a Cube?

My current approach is creating a Fact table which has a separate not null foreign key column for each each dimension . If i need to save the data for the dimension1 then its foreign key value will be saved in dimension1 (which is foreign key to Dimension1 table) other dimension keys(dimension2,dimension3..) will point to N/A value of respective dimension.

How this approach can be improved? Pros and cons ? How the ordering of the dimensions can be achieved in cube design

1
As SebTHU has said, it'd helpful if you could edit this to say what the various business concepts you're trying to model are. You've mentioned Sector, Fund, and VAR - are there more, or just those? I think it's possible that you're basing too much on the OLTP database structure when what you need to be modelling are the business processes and concepts, but it's hard to tell when just discussing generic dimensions.Jo Douglass
Also, where you think it would be helpful to know the structure of your current tables, could you show us the actual structures (with modified names if you need to anonymise things), rather than describing them?Jo Douglass
@Jo Douglass: Question is already edited. We have several measures like ImpliedVol,Vol,VAR etc and dimensions are Fund,Symbol,Sector,Subsector,Country,Masterfund etc. What we need is a fact table where those measures can be stored for different compression e.g Fund , Symbol, Fund and Symbol, Symbol and Fund, Sector and fund. When we have Fund compression then foreign key for other comression (dimension) will point to N/A value of that dimension. How the ordering should be maintained about (fund and symbol) or (symbol and fund) while fetching the data ?107
When creating a fact table you need to think about the grain - if you have a measure which exists at different grains (which is what it sounds like when you say different compression, and needing to leave some dimension keys null), then you will need multiple measures in different fact tables. You can avoid that if you can split each measure down to the lowest grain, hold it at that level only, then sum it up in the cube. Are there any natural hierarchies between the business concepts you've listed?Jo Douglass
@Jo Douglass: Measures are can not be aggregated among different grain.Natural hierarchies do exist. One hierarchy is Fund -> Symbol. Still we can not sum up the Symbol Var numbers to get the Fund var number.107

1 Answers

0
votes

I'm not sure exactly what you mean by this:

creating a Fact table with all the dimensions as the foreign key which points to actual dimension tables

but it sounds like the wrong approach.

A fact row should have a separate non-NULL foreign-key column for every single dimension that slices the fact table. (Even if, occasionally, some of these point to a particular "Unknown" member in a particular dimension). In other words, each fact row should represent the intersection of particular members from every (relevant) dimension.

From the sound of it your problem is that your source system suffers from a database antipattern: that of making attributes "flexible" by allowing attributes to be added adhoc, simply by creating a row with (AttributeName, ObjectThisIsAnAttributeOfID, AttributeValue), rather than defining this relation as a table structure with a FK to the main table.

There's going to be a lot of data analysis and transformation needed to make a cube design result from this data structure:

  1. How many distinct values of Attribute are there? Each distinct value is a candidate to be a dimension in the cube.
  2. For each Attribute, how many distinct values of Entity are there? Do they share a common data type? Do you know what future values might be? This will need to be done for each dimension
  3. For each "thing" (fact) that has attribute values: do they consistently have values for ALL the attributes? Or for a subset of the attributes?

You may find that (given the work and time involved) you can only model a small subset of the existing attributes.

Feel free to comment or reply with more details in case I've misunderstood.

But from the sound of it the basic problem is that your source system doesn't assert any rules about the relations between "things" and their attributes. OLAP cube design involves strongly asserting these kinds of rules, at initial design-time.