0
votes

I am detecting some discrepancies between the total number of sessions in my data on Google Analytics and BigQuery. For calculating this in BQ, I use the following:

SELECT date, 
EXACT_COUNT_DISTINCT(CONCAT(fullvisitorid, STRING(visitid))) 
AS sessions 
FROM [MyData]

The results obtained with this query are these:

Query Results Row date sessions
1 20150601 1269258
2 20150602 1269299
3 20150603 1131770

In Google Analytics, the numbers of sessions per day are:

Day Index Sessions 6/1/2015 1113088
6/2/2015 1112203
6/2/2015 993178

Is there any reason for this difference?

1
AFAIK in analytics when you run a report it's an approximation and it's displayed on your upper left of your screen. - Pentium10
But I think this was supposed to be unsampled data. - Silvia Pina
@SilviaPina I'm more of a BQ/App Engine guy, but looking in the GA docs, I found this : support.google.com/analytics/answer/… That says your data might be sampled if you have a high volume of data. I don't know what they consider high... but 11 million sessions might hit the threshold - Patrice
What numbers do you get if you do "SUM(totals.visits) as sessions" instead? - Felipe Hoffa
With that variant I get yet another result: 20150601 1268011 ; 20150602 1269063 ; 20150603 1131540 - Silvia Pina

1 Answers

0
votes

You may want to try the following query

SELECT date2, 
       Count(DISTINCT sessionid) 
FROM  (SELECT Concat(clientid, "-", visitnumber, "-", date)       AS sessionId, 
              Date_trunc(Date(Parse_date('%Y%m%d', date)), month) AS date2 
       FROM   `project-id.dataset-id.ga_sessions_*` 
       WHERE  _table_suffix BETWEEN '20191201' AND '20210131' 
              AND totals.visits = 1) 
GROUP  BY date2 
ORDER  BY date2 

The reason of discrepancy and related code are given in this article: Google Analytics + BigQuery Tips: Users and Sessions