0
votes

I'm working on a few queries to gauge day-over-day performance using Google Analytics data in BigQuery. My project is currently set up with data streamed in from GA. The problem is that yesterday's table, ga.us.ga_sessions_20181105, isn't created until sometime in the late morning.

The prior day's real-time view, ga.us.ga_realtime_sessions_view_20181105, still exists before yesterday's table is created.

Is there a way to query for both yesterday's table and yesterday's real-time view, and handle the table not found error if yesterday's table hasn't been written yet?

Something like this:

SELECT
  date,
  SUM(totals.visits) AS visits
FROM
    -- If this table doesn't exist yet
    TABLE_DATE_RANGE([ga:us.ga_sessions_],
    DATE_ADD(DATE_ADD(CURRENT_TIMESTAMP(), -6, 'HOUR'), -1, 'DAY'),
    DATE_ADD(DATE_ADD(CURRENT_TIMESTAMP(), -6, 'HOUR'), -1, 'DAY')),
    -- Query this view instead
    TABLE_DATE_RANGE([ga:us.ga_realtime_sessions_view_],
    DATE_ADD(DATE_ADD(CURRENT_TIMESTAMP(), -6, 'HOUR'), -1, 'DAY'),
    DATE_ADD(DATE_ADD(CURRENT_TIMESTAMP(), -6, 'HOUR'), -1, 'DAY'))
WHERE
  TIME(DATE_ADD(SEC_TO_TIMESTAMP(visitStartTime), -6, 'HOUR')) <= TIME(DATE_ADD(CURRENT_TIMESTAMP(), -6, 'HOUR'))
GROUP BY
  date;

SOLUTION: Thanks to @Bobbylank, I'm just selecting both tables in the same query:

SELECT
  date,
  SUM(totals.visits) AS visits
FROM
  TABLE_DATE_RANGE([ga:us.ga_sessions_],
    DATE_ADD(DATE_ADD(CURRENT_TIMESTAMP(), -6, 'HOUR'), -1, 'DAY'),
    DATE_ADD(DATE_ADD(CURRENT_TIMESTAMP(), -6, 'HOUR'), -1, 'DAY')),
  TABLE_DATE_RANGE([ga:us.ga_realtime_sessions_view_],
    DATE_ADD(DATE_ADD(CURRENT_TIMESTAMP(), -6, 'HOUR'), -1, 'DAY'),
    DATE_ADD(DATE_ADD(CURRENT_TIMESTAMP(), -6, 'HOUR'), -1, 'DAY'))
WHERE
  TIME(DATE_ADD(SEC_TO_TIMESTAMP(visitStartTime), -6, 'HOUR')) <= TIME(DATE_ADD(CURRENT_TIMESTAMP(), -6, 'HOUR'))
GROUP BY
  date;
1

1 Answers

2
votes

I seem to remember having this problem and getting around it by selecting the ga_sessions for 2 days before and the previous day and the realtime table for the previous and current day.

That ensured the query wouldn't error as there should always be at least one table available for each day.

I then just selected the rows for the previous day from the resulting table.