0
votes

I am looking for an answer regarding report data storage concept in Power BI.

I have published 3 reports to Power BI service (cloud):

  1. Report1 with Excel source

  2. Report2 with onpremise Sql server source

  3. Report3 with azure sql source

Around 200 users in my organization will be accessing these reports. I want to understand whether:

  1. The first time a particular report is accessed, will the data be fetched from the source and shown in the report or will it be stored to some cloud location from where the data will go to the report?

  2. Suppose a user opens a report that was already viewed by another user, then will the data be fetched from the source again or is there any concept of cross user shared cache?

  3. Suppose a user opens the report for the 2nd time (example: after having already accessed it, suppose user refreshes the web page), will the data will be fetched again? Or is there any concept of shared cache?

  4. Does the answer to any of the above change if I had used the Power BI reporting server (onpremise) and deployed the report on the PBRS?

1

1 Answers

1
votes

With the service, you typically upload a PBIX, which contains the report pages and all of the underlying data. Unless you set up a data gateway to accommodate DirectQueries and/or scheduled refreshes, the cloud service does not access your original data sources at all. With a scheduled refresh, it only accesses the original data during the refresh. A DirectQuery connection does access a server "live" but has many limitations.

  1. The data is fetched when you load it into your Power BI desktop application and then loaded into the cloud when you publish the report to a workspace. Once it's there, the data shown to the user is fetched from the cloud copy, not the original data source.

  2. Same answer as above regarding where the data is fetched from (the cloud copy). I don't believe there is shared cache between users but rather each user has some temporary caching individually. This type of caching saves the calculation results (computed on the underlying data) that are needed to populate the report visuals.

  3. There is some caching done temporarily so that if a user switches among slicer combinations to one previously chosen you may see much quicker loading than when selecting a new configuration since it cached the results and doesn't need to recompute them. As far as I understand, this kind of caching is short-lived and not shared among users. Remember, this type of cache is not the same as the underlying data in the cloud copy of the PBIX.

  4. I've not used an on-premise server, but I would expect the behavior to be similar with the exception that the service is on the local server instead of a could server somewhere else.

The upshot is that traffic in the service is separated from the requests to the original source data (assuming no DirectQuery connections). Those original sources are only accessed during data refreshes, which are independent of end-user actions (under the same assumption).