Good day!
I have a relational data source with 3 fact tables which are related to each other and model
patient doctor visits (EncounterEventFact table), assignment of a diagnosis to patient (DiagnosisEventFact table) and collecting lab results for the patient (LabComponentResultFact table) in the picture below. They all share EncounterKey a key, specifying a unique visit to a doctor. All EncounterKeys are in EncounterEventFact table and only once. I am using SSAS tabular in memory to model the data.
Each fact table has a few million rows (2-4 mil). DiagnosisDim has few tens of thousands of rows. PatientDim has few millions rows (<10 mil). LabDim has few hundred records. This is a simplified model with just 3 main dimensions.
My measures are a distinct count of EncounterKeys on DiagnosisEventFact and distinct count of EncounterKeys on LabComponentResultFact.
The sample report which is being developed is given a selection on diagnosis dim side, show counts of unique visits for this diagnosis and counts of lab collection visits for this diagnosis.
diagnosis count_visits_diagnosis count_visits_labs
ABC 5 0
DEF 10 5
etc
The problem comes when I select ALL diagnoses in the report. Counts for diagnoses measure loads instantly, while counts of labs measure takes forever. When I select a few diagnoses report works fast.
How to improve my SSAS tabular model or computation to handle this sort of reporting efficiently?
Unfortunately I can not re-model data on the data source side.