
I have a question about the data export from Google Analytics into BigQuery.
I have configured the streaming export on the Google Analytics side so that data is exported into BigQuery in real time (table ga_realtime_sessions_YYYYMMDD). This streaming is working fine.

At some point at the end of the day, the data from this real-time table is exported into ga_sessions_YYYYMMDD.

What I need explained is how this export (from the real-time table into the ga_sessions one) works.

I have several automated processes that run around 8 AM (Portugal timezone), and in recent days they have been failing because the ga_sessions table for the previous day has not been created yet.
I checked the time at which the ga_sessions table is created each day, and it is very volatile: in some cases it is around 2 AM or 3 AM, but in others it is around 7 AM or 8 AM. Could this time difference be due to the size of the data that needs to be exported from the real-time table into the ga_sessions one?


1 Answer


The daily session exports to BigQuery are indeed not completed at the same time every day. The export runs on a fully managed backend, so its completion time depends on workloads worldwide.

I suggest that you create an event listener on the creation of ga_sessions_YYYYMMDD, so that dependent processes run only once it exists.
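
If a full event listener is more than you need, a simpler polling variant of the same idea is to have the 8 AM job wait until the table exists. This is only a minimal sketch, not an event-driven listener: it assumes the `google-cloud-bigquery` client library is installed and authenticated, and `wait_for_table`, the timeout, and the polling interval are hypothetical names and values chosen for illustration.

```python
import time
from datetime import date
from typing import Callable


def daily_table_name(day: date) -> str:
    """Return the daily export table name, e.g. ga_sessions_20240131."""
    return f"ga_sessions_{day:%Y%m%d}"


def bigquery_table_exists(table_id: str) -> bool:
    """Check whether a table exists, using the google-cloud-bigquery client."""
    # Imported lazily so the polling logic can be used without the library.
    from google.api_core.exceptions import NotFound
    from google.cloud import bigquery

    client = bigquery.Client()
    try:
        client.get_table(table_id)  # raises NotFound if the table is missing
        return True
    except NotFound:
        return False


def wait_for_table(
    table_id: str,
    exists: Callable[[str], bool] = bigquery_table_exists,
    timeout_s: float = 6 * 3600,   # assumption: give up after 6 hours
    interval_s: float = 300,       # assumption: check every 5 minutes
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the table exists; return False if the timeout elapses."""
    waited = 0.0
    while True:
        if exists(table_id):
            return True
        if waited >= timeout_s:
            return False
        sleep(interval_s)
        waited += interval_s
```

The dependent processes would then start only if something like `wait_for_table("my-project.my_dataset." + daily_table_name(yesterday))` returns `True`, where the project and dataset names are placeholders for your own.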

One way to set up such a listener is to export the data to a Cloud Storage bucket, then use a Cloud Function triggered on that bucket.
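
A rough sketch of what the Cloud Function side could look like, assuming a background function (1st gen style, `(event, context)` signature) deployed with a Cloud Storage object-finalize trigger. The object naming pattern and `run_dependent_processes` are hypothetical; replace them with whatever your export and downstream jobs actually use.

```python
import re
from typing import Optional

# Assumption: exported objects contain "ga_sessions_YYYYMMDD" in their name.
EXPORT_PATTERN = re.compile(r"ga_sessions_(\d{8})")


def export_date(object_name: str) -> Optional[str]:
    """Extract the YYYYMMDD suffix from an object name, or None if absent."""
    match = EXPORT_PATTERN.search(object_name)
    return match.group(1) if match else None


def run_dependent_processes(day: str) -> None:
    """Hypothetical placeholder: start the jobs that need ga_sessions_<day>."""
    print(f"ga_sessions_{day} is available; starting dependent jobs")


def on_export_finalized(event: dict, context=None) -> Optional[str]:
    """Entry point: called once per object written to the bucket."""
    # For Cloud Storage events, the event payload carries the object "name".
    day = export_date(event.get("name", ""))
    if day is None:
        return None  # unrelated object, nothing to do
    run_dependent_processes(day)
    return day
```

With this shape, the downstream work starts as soon as the export lands instead of at a fixed time, so a late export delays the jobs rather than failing them.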