Summary
I have a simple power BI report with 3 KPIs that take 0.3 seconds to refresh. Once I move the data to Azure Dedicated SQL Pool, same report takes 6 seconds to refresh. The goal is to have a report that would load in less than 5 seconds with 10-15 KPIs and up to 10 filters applied and connect to much larger (several million lines) datasets.
How to improve performance of Power BI connected via Direct Query to Azure Dedicated SQL pool?
More details
Setup: Dedicated SQL pool > Power BI (Direct Query)
Data: 3 tables (rows x columns: 4x136 user table, 4x3000 date table, 8x270.000 main table)
Data model: Main table to Date table connection - many-to-one based on “date” type field. Main to User connection - many-to-one based on user_id (varchar(10)). Other data types used: date, integers, varchars (50 and less).
Dedicated SQL pool performance level is Gen2: DW200c. However, the number of DWUs used never exceeds 50, most often it stays under 15.
Report: The report includes 3 KPIs that are calculated based on the same formula (just with “5 symbols text” being different):
KPI 1:=
CALCULATE(
COUNTA('Table Main'[id]),
CONTAINSSTRING('Table Main'[varchar 50 field 0], "5 symbols text"),
ISBLANK('Table Main' [varchar 50 field 1]),
ISBLANK('Table Main' [varchar 50 field 2])
)
The report includes 6 filters: 5 “basic filtering” with 1 or several values chosen, 1 “string contains” filter.
Problem: The visual displaying 3 KPIs (just 3 numbers) takes 6…8 seconds to load, which is a lot, especially given the need to add more KPIs and filters. (compared to 0,3 seconds if all data is loaded in .pbix file). It is a problem because adding more KPIs on the page (10-15) increases the time to load proportionally or even more for KPI that are more complicated to calculate and reports become unusable.
Question: Is it possible to significantly improve the performance of the report/AAS/SQL pool (2…10 times faster)? How? If not, is it possible to somehow cache the calculated KPIs / visual contents in the report or AAS without querying the data every time and without keeping the data in pbix or AAS model?
Solutions tried and not working: Sole use and different combinations of clustered columnstore, clustered rowstore, non-clustered rowstore indexes. Automated statistics on/off. Automated indexes and stats do give an improvement of 10…20%, but it is definitely not enough.
Simple list of values (1 column from any table) takes 1.5 to 4 seconds to load)
What I have tried
Moving SQL pool from West Europe to France and back. No improvements
Applying indexes: row- and columnstore, clustered and non-clustered, manual and automatically defined (inl. statistics) - gives 10...20% improvements in performance, which does not resolve the issue.
Changing resource classes: smallrc, largerc, xlargerc. % of DWUs used still never exceeds 50 (out of 200). No improvements
Shrinking data formats and removing excessive data: minimal nvarchar(n) possible, the biggest is nvarchar (50), all excessive columns have been removed. No improvements
Isolating the model: I have a larger data model, for testing puproses I have isolated 3 tables into a separate model to make sure other parts do not affect the performance. No improvements.
Reducing number of KPI and filters With only 2 report filters left (main table fields only) the visual takes 2 seconds to load. With +2 filters on the connected date table 2,5 sec, with +2 filters on user table 6 sec. Which means that I would only be able to use 1-2 filter reports, which is not acceptable.