0
votes

I've been using an SSIS integration component to download data from Google Analytics in order to keep a historical view of some websites and track how they evolve. The metrics we track are Visits (now Sessions) and Visitors (now Users), and the dimensions are Year and Month. However, today I noticed that the data I downloaded for July shows a discrepancy in the Users metric. I have heard that Google Analytics uses an estimation method to "calculate" some (if not all) of its metrics. Could it be that it later "adjusts" the data with more accurate information? If so, is this mentioned in the documentation? (A link would be highly appreciated.) I ask because our users are complaining that we are not delivering the real GA data. I looked through the Google Analytics documentation page with no luck.

Thanks for your time.

PS: Sorry for my English, it isn't my native language.

2
1. Depending on when you download the data, it can take time to finish processing: up to 48 hours. 2. Depending on how much data you are extracting, your data could be sampled. 3. Whose SSIS task are you using? - DaImTo
Hello, thanks for your answer. I'm using an SSIS task from ssis-components.net. However, before using it I validated that the data I download is the same as what is shown on the Google Analytics platform. I run my extraction task on the 1st day of every month at 3 am. - OcR
You should contact them; it sounds like your data is sampled or something. Try running it again for July now and see if it still returns incorrect data. Or try ga-dev-tools.appspot.com/explorer, which will at least show you what numbers your SSIS task should be returning. - DaImTo

2 Answers

0
votes

If you are using the standard version of Google Analytics (you'll know if you are paying $150k for Premium), data is sampled depending on volume. Have a read of this article: can-you-trust-your-google-analytics-data

I have seen very slightly different results returned when calling the API repeatedly with the same historical parameters. In my case the figures differed by only 1-2 over a daily set of several thousand, but they did differ.
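To see whether two runs of the same extract really disagree, and by how much, something like the following may help. It is only a sketch: the dictionaries keyed by (year, month) and the figures in them are made up for illustration, and `diff_extracts` is a hypothetical helper, not part of any GA or SSIS tooling.

```python
def diff_extracts(old, new):
    """Return {key: (old_value, new_value)} for every key whose value changed."""
    changes = {}
    for key in sorted(set(old) | set(new)):
        before, after = old.get(key), new.get(key)
        if before != after:
            changes[key] = (before, after)
    return changes


# Hypothetical Users figures keyed by (year, month); not real GA data.
first_run = {(2014, 6): 4210, (2014, 7): 4388}
second_run = {(2014, 6): 4210, (2014, 7): 4390}

print(diff_extracts(first_run, second_run))  # {(2014, 7): (4388, 4390)}
```

Keeping the result of each run and diffing it against the next one makes it easier to tell a small sampling wobble from a month that was simply downloaded too early.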

If you want to guarantee your results, consider upgrading to Premium.

0
votes

Sampling could be an issue if you are requesting more than 50,000 rows for the time period. To avoid it, you can download more often, such as daily.
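As a rough sketch of what "download daily" could look like, the generator below splits a date range into one-day windows. The name `daily_ranges` and the example dates are hypothetical; the actual request to Google Analytics (or to the SSIS component) is deliberately left out, since that depends on the tool you use.

```python
from datetime import date, timedelta


def daily_ranges(start, end):
    """Yield (start_date, end_date) pairs, one per day, covering start..end inclusive."""
    day = start
    while day <= end:
        yield day, day
        day += timedelta(days=1)


# Example: one request window per day of a month (dates are illustrative).
july = list(daily_ranges(date(2014, 7, 1), date(2014, 7, 31)))
print(len(july))  # 31
```

This suits additive metrics such as Sessions, where daily values can be summed afterwards. It does not suit Users, because daily Users summed over a month is not the same as Users for the month.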

But I think your issue is that there is a processing time for Google Analytics - if you are downloading at 3 am on the 1st it is probable that the processing for the previous day has not finished.

The Google Analytics Premium SLA is for 4-hour data freshness, so even Premium would have trouble with a 3 am download. Pragmatically, you should allow 24 hours before you download data for the previous day, and 48 hours for e-commerce data.
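The waiting rule can be turned into a small guard that the extraction job checks before it runs. This is a minimal sketch under a few assumptions: the function names are made up, the 24 and 48 hour lags are the rule-of-thumb values from this answer rather than documented guarantees, and `now` is assumed to be expressed in the reporting time zone of the GA view.

```python
import calendar
from datetime import date, datetime, timedelta


def latest_safe_date(now, lag_hours=24):
    """Last calendar day that ended at least lag_hours before now."""
    return (now - timedelta(hours=lag_hours)).date() - timedelta(days=1)


def month_is_settled(year, month, now, lag_hours=24):
    """True once the final day of the month is old enough to download."""
    last_day = date(year, month, calendar.monthrange(year, month)[1])
    return last_day <= latest_safe_date(now, lag_hours)


# A job running at 3 am on the 1st is too early for the previous month...
print(month_is_settled(2014, 7, datetime(2014, 8, 1, 3, 0)))  # False
# ...but the same job one day later is fine with a 24 hour lag.
print(month_is_settled(2014, 7, datetime(2014, 8, 2, 3, 0)))  # True
```

With `lag_hours=48` (the e-commerce case) the same July extract would only be considered settled from the 3rd of August.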

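Thirdly, make sure it is not Unique Visitors you are requesting, as this metric depends on the time period requested: a visitor who returns on several days is counted once for the month, so daily figures do not add up to the monthly one.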
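A toy example shows why the period matters for unique visitors. The visitor IDs and dates below are invented purely for illustration; this is not how GA stores or computes the metric internally, just the counting logic.

```python
# Hypothetical visitor IDs seen on each day; not real GA data.
daily_visitors = {
    "2014-07-01": {"a", "b", "c"},
    "2014-07-02": {"b", "c", "d"},
    "2014-07-03": {"a", "d"},
}

# Adding up each day's unique count double-counts returning visitors.
sum_of_daily_uniques = sum(len(ids) for ids in daily_visitors.values())

# Counting uniques over the whole period counts each visitor once.
uniques_for_period = len(set().union(*daily_visitors.values()))

print(sum_of_daily_uniques)  # 8
print(uniques_for_period)    # 4
```

So a Users figure requested per day and summed, and one requested for the whole month, can legitimately differ even when nothing is wrong with the data.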