1
votes

I have identified 3 dimensions and 1 measure table. It will be Star schema.

My measure group would have Count(A/C number). Each dimension table has look up table tied to A/c number kind of one to one relationship.

Dim1 ID1 Cat1

Dim2 ID2 Cat2

Dim3 ID3 Cat3

Fact A/c number Count(A/c) ID1 ID2 ID3 Above is just an example, Of course in real time there are 15 dimension table(one to one relation) with fact table and data close to million records that's why we need to come up with best design/performance.

I know FACT/Measure is always aggregate or a measure of business and in this case measure is count(A/c number).

Question: 1. Do i need to add A/c number to the fact table. Remember adding A/c number to the fact table, fact would be huge/big. Good or bad, performance wise??

  1. Do i create additional Factless fact table similar to fact table but fact table will have only count(a/c number) and Factless fact table would have actual a/c numbers with dimension values too.. this would be a big table. Good or bad, performance wise??

  2. Do i create additional column(a/c number) along with look up values on the dimension tables so fact table would have facts.. Good or bad, performance wise??

  3. Also i need to know, dimension process/deploy is faster(or should be faster) fact process/deploy is faster(or should be faster) and what's preferred in real time.

I want to know which option to select in real time or is there better solution. Please let me know!!

2

2 Answers

2
votes

If I understood correctly you are talking about a degenarated dimension.

That´s a common practice and in my opinion is the correct way to tackle your issue.

For instance, let´s say that we have a order details table with a granularity of one row per order line. Something like this: Please visit this link to see the image because I'm not still able to post images in the forum: http://i623.photobucket.com/albums/tt313/pauldj54/degeneratedDimension.jpg

If your measure is the count of orders, from the example above the result is: 2 Please check the following link:

Let me know if you have further questions.

Kind Regards,

Paul

0
votes

You described your volume as "close to million records". That sounds trivial to process on any server (or even desktop or laptop) built in the last 5 years.

Therefore I would not limit the design to solve an imagined performance issue.