5
votes

What I need is reducing process time of a SSAS cube using AMO while adding data to a fact table in data warehouse.

According to Microsoft document on SSAS:

Process Add

"For dimensions, adds new members and updates dimension attribute captions and descriptions. For measure groups and partitions, adds newly available fact data and process only to the relevant partitions."

Process Full

"Processes an Analysis Services object and all the objects that it contains. When Process Full is executed against an object that has already been processed, Analysis Services drops all data in the object, and then processes the object. This kind of processing is required when a structural change has been made to an object, for example, when an attribute hierarchy is added, deleted, or renamed."

Thus with following code, I could have at least similar processing time per record regardless amount of data in data warehouse.

var start = DateTime.Now;
var query = "SELECT [dbo].[FactGradingResult].* FROM [dbo].[FactGradingResult]  WHERE ([Id] = "+ grading2.Id+")";
ptn.Process(ProcessType.ProcessAdd,
    new QueryBinding(dsv.DataSourceID, query));
var end = ptn.LastProcessed;
swch2 = (end - start).TotalMilliseconds;

But the ProcessAdd still takes about 900 ms to update the cube with a single row of fact table. Is this typical time of ProcessAdd of SSAS with a 8-core 2.5GHz machine of Windows Server 2012? If not, how can I improve cube processing time per a row to fact table?

2
What is the frequency of your reprocessing? Are you trying to real-time refresh the cube? Or is this a nightly refresh? Have you partitioned the measure group? Partitioned by what? Do new rows always arrive in the most recent partition?GregGalloway
I would suggest you test the performance of ProcessAdd to add one row to a million row partition vs. adding one row to a small partition. My guess is it will slow down the larger the partition gets. So I would recommend partitioning so you're doing your ProcessAdd on a relatively small partition.GregGalloway

2 Answers

5
votes

Based upon your clarification that you're attempting a real-time refresh of the cube, I would recommend you weigh which is more important to you. Is query performance more important? Then continue what you're doing since MOLAP query performance will be better than my ROLAP suggestion below.

However, if it is more important you have zero latency in getting new data into the cube then I would suggest you switch to some approach which includes ROLAP.

To describe the ROLAP approach further, I would recommend creating one MOLAP partition which contains all the data in your fact table through yesterday. Then create a second partition which is set to StorageMode=ROLAP and which is filtered to any rows which arrived today. Then every night, alter the partition definitions appropriately and reprocess the first partition. Make sure all your dimensions are MOLAP as I'm assuming they don't need to change real-time. (I would avoid ROLAP dimensions for performance reasons.)

You will probably have to set the "Real Time OLAP=true" property on the connection string as described here.

2
votes

I believe process add with correct partitioning would do this for you. Not an expert in this, but here is a presentation on how to do it from a SQL Saturday I went to a while back. http://www.sqlsaturday.com/SessionDownload.aspx?suid=3708