0
votes

in an effort to fast-track a monthly comparison between Flurry and Firebase event-reporting, I wrote this simple script below to get the monthly 1) Event Count and 2) Count of Unique Users that triggered these events.

In theory the data I get from BigQuery (linked to the same Firebase account), should give the same output as the Firebase Analytics Console below. I only added a random example Screenshot to indicate which events I am talking about.

enter image description here

# My Standard SQL - Script
SELECT 
  event.name as event_name,
  COUNT(event.name) as event_count,
  COUNT(DISTINCT user_dim.app_info.app_instance_id) as users
FROM `project_id.com_game_example_IOS.app_events_*`,
  UNNEST(event_dim) as event
WHERE (_TABLE_SUFFIX BETWEEN '20170701' AND '20170731')
------ Inclusive for both the start-date and end-date.
GROUP BY event_name
ORDER BY event_count DESC
LIMIT 10;

Results:

  • The event_count from the BigQuery output is exactly the same as the output in my console, but the distinct user_count is out by about 5K-10K for most of the events. In theory, the Firebase Analytics Console's users data should line up exactly with the BigQuery-output, but it does not.

My Questions:

  • Is my query not correct, or is it that the Analytics Console outputs the wrong data ?
  • Can my query be even more optimised ?
  • Am I perhaps not considering something like Timestamps?

Thanks!

1

1 Answers

4
votes

There are a couple of things potentially at play here.

  1. See the answer/comments about COUNT( DISTINCT ) here.

  2. Earlier this year, Analytics launched HLL-based user metrics in order to provide customers with more flexibility when filtering user metrics by user properties and audiences. And so, although your event counts may match exactly between BigQuery and your Analytics reports, your user counts may differ slightly.