I recently attempted to query the Google Analytics API for a report using device category, source, and medium as dimensions. The report covered about four weeks of time. Despite the fact that I was able to build the equivalent ad-hoc report in the UI and get results based on 100% of sessions, I couldn't get the API to give me results based on any more than 1.3% or so of sessions. The client I'm using is based on the v3 API, but I got the same results when using Google's v4 testing tool, so it's not a function of the API version.
According to Google's documentation, ad-hoc reports are supposed to use pre-aggregated unsampled data where possible:
Ad-hoc reports are based on any non-standard query of Analytics data. For example, if you apply a segment or secondary dimension to a standard report, then Analytics has to issue a new, non-standard query of the data to return that information.
The new query goes first to the tables of aggregated data to see if all of the requested information is available there. If the information is not available there, then Analytics queries the complete, unfiltered set of data and computes new aggregates to satisfy the application of the segment or secondary dimension.
This is apparently true of the web UI, but not necessarily of the API. I was under the impression that the web UI was making calls equivalent to those exposed in the API under the hood, but it seems that this isn't the case. Does anybody know whether it's possible to force the API query to use the pre-aggregated data sets that I know are available?
samplingLevel
toLARGE
– Matt