1
votes

I am working on modernizing a reporting solution where the data sources are on prem on the customers' sql servers (2014) and the reports are displayed as Power BI reports on the customer's Power BI Service portal. Today I use SSIS to build a data warehouse, as well as an on premise data gateway to ensure the transport of data up to an Azure analysis services which in turn are used by the Power BI reports.

I have been wondering if I could use Azure Synapse to connect to customer data and in a most cost effective way transport data to Azure and link them to the Power BI workspace as a shared dataset. There are many possibilities, but it is important that the customer experiences that the reports are fast and stable, and if possible can cope with near real time.

I experience SSIS being cumbersome and expensive in azure. Are there mechanisms that make it cheap and fast to get data in azure? Do I need a data warehouse (Azure SQL database) or is it better to use data lake as storage for data? Needs to do incremental load too. And what if I need to do some transformations? Should I use Power BI dataflow or do I need to create Azure Data flows to achieve this?

Does anyone have good experience to use synapse (also with DevOps in mind) and get a good DEV, TEST and Prod environment for this? Or is using Synapse a cost driver and a simpler implementation will do? Give me your opinions and if you have links to good articles, please do so. Looking forward to hear from you

regards Geir

1

1 Answers

0
votes

The honest answer is it depends on a lot of different things and I don't know that I can give you a solid answer. What I can do is try to focus down which services might be the best option.

It is worth noting that a Power BI dataset is essentially an Analysis Services database behind the scenes, so unless you are using a feature that is specifically only available in AAS and using a live connection, you may be able to eliminate that step. Refresh options are one of the things that are more limited in Power BI though, so the separate AAS DB might be necessary for your scenario.

There is a good chance that Power BI dataflows will work just fine for you if you can eliminate the AAS instance, and they have the added advantage of have incremental refresh as a core feature. In that case, Power BI will store the data in a data lake for you.

Synapse is an option, but probably not the best one for your scenario unless your dataset is large, SQL pools can get quite expensive, especially if you aren't making use of any of the compute options to do transformations.

Data Factory (also available as Synapse pipelines) without the SSIS integration is generally the least expansive option for moving large amounts of data. It allows you to use data flows to do some transformations and has things like incremental load. Outputting to a Data Lake is probably fine and the most cost effective, though in some scenarios something like an Azure SQL instance could be required if you specifically need some of those features.


If they want true real time, it can be done, but none of those tools really are built for it. In most cases the 48 refreshes per day (aka every 30 minutes) available on a Premium capacity are close enough to real time once you dig into the underlying purpose of a given report.

For true real time reporting, you would look at push and/or streaming datasets in Power BI and feed them with something like a Logic App or possibly Stream Analytics. There are a lot of limitations with push datasets though- more than likely you would want to set up a regular Power BI report and dataset and then add the real time dataset as a separate entity in addition to that.


As far as devops goes, pretty much any Azure service can be integrated with a pipeline. In addition to any code, any service or service settings can be deployed via an ARM template or CLI script.

Power BI has improved in the past couple years to have much better support for devops and dev/test/prod environments. Current best practices can be found in the Power BI documentation: https://docs.microsoft.com/en-us/power-bi/create-reports/deployment-pipelines-best-practices