9
votes

Do Clustered Columnstore Indexes affect SSAS Performance querying for the end-user, and how could to fix this? Read an article below here how Sort order affects SSAS query customer user performance.

Is there any way to resolve this issue?

Would rebuilding SSAS indexes/aggregations work? Already know time to process the cubes from Data Warehouse to SSAS can be affected. What real concern is about is end customer-user SSAS querying experience.

Currently implementing Multidimension model in SSAS Cube.

1
Are you at a sufficient scale to be worried about this? You need to have 100GB+ cube sizes before these issues become significant concerns.Mike Honey
Are you sure there is an issue to begin with? That question is about cube processing, not end user querying. A cube has to read all data from a table during processing, which means it needs to crach the compressed buckets in a columnstore's dictionaries to reconstitute rows out of the columns. That's obviously slower than reading the rows directly from another table.Panagiotis Kanavos
End user querying of the SSAS cube won't be affected at all unless the cube has to read from the columnstore. If it has to though, it will probably be faster because a columnstore is a lot faster for aggregate queriesPanagiotis Kanavos
With properly designed and tuned columnstore tables you could use both columnstores and SSAS in DirectQuery modePanagiotis Kanavos
Also check Overview of SSAS Tabular in DirectQuery Mode for SQL Server 2016. It may actually be a good idea for certain scenarios to use SSAS and DirectQuery over columnstore indexes.Panagiotis Kanavos

1 Answers

2
votes

Well, it depends. Let's start from the problem definition.

  • SSAS multidimensional performs better when fed ordered data on processing step. This article gives you reasons why and insights on data ordering.
  • SSAS Index and Aggregations processing will not fix source data being not ordered; therefore, it will not fix problems described above. These processing tasks build artifacts based on data received, and it cannot fix problems with the data itself.
  • MS SQL Columnstore Index is roughly a new storage technology -- columnstore compression applied to heap tables. This gives fast insert (no indexes, no pre-sort required) compared to table with Clustered Index. Downside - SELECT query on table with Clustered Index is likely to return rows ordered on Clustered Index base (unless you set ordering with ORDER BY statement), while the same query on the Clustered Columnstore table will yield unsorted data.
    This problem of unsorted data with Clustered Columnstore index affects not only SSAS, it degrades query performance when CCI could do so called segment elimination. There are some techniques to defeat that - sorting data before converting regular table to CCI or sorting data on load to CCI table.
  • The main problem of the discussion you mentioned is that data ordering is done with additional views on SQL level. Then author defines partitions on SSAS, and reports that SSAS generated queries have suboptimal execution plans.

Regarding SSAS performance on unordered data. It certainly will be suboptimal, but to what extent? In fact, only tests will show it; it can depend on multitude of factors - initial data set, cube design, end user queries. Growth of cube structures will slow down operations, but how much? From the experience - I would bother and give efforts to provide data ordering if cube is 100+ GB and its biggest partition/measure group is more than 10% of RAM used by SSAS. In other circumstances I would not bother about such problem.

Ordering data from CCI. First, avoid obsolete syntax

SELECT TOP 2147483647 ... FROM ... ORDER BY ...  

Use ANSI-compliant and less restrictive

SELECT ... FROM ... ORDER BY ... OFFSET 0 ROWS  

Regarding suboptimal execution plan when used in SSAS Partition definition. Unfortunately, SSAS query generation engine does not allow magical option (recompile). Again, if this is a serious problem - define a table-valued function (parametric view) to achieve optimal execution plan, and use this TVF in SSAS partition definition.

If this is the first implementation of the project - I would go without such measures and report it as project risks which require attention on go productive and possibly - additional efforts after.

Unfortunately, rebuilding SSAS indexes/aggregations will not improve the situation. You need to pre-sort data on DB query level when feeding into SSAS.