2
votes

I have 1,512 active users in the Firebase Dashboard for this date range Dec 18, 2018 to Dec 21, 2018.

I executed this query and got the following:

SELECT
 COUNT(DISTINCT user_pseudo_id ) as active_user_count
FROM
  -- PLEASE REPLACE WITH YOUR TABLE NAME.
  `xxxxxx-xxxxx.analytics_000000000.events_*`
WHERE
  event_name = 'user_engagement'
  AND event_timestamp >
      UNIX_MICROS(TIMESTAMP_SUB(CURRENT_TIMESTAMP, INTERVAL 200 DAY))
  -- PLEASE REPLACE WITH YOUR DESIRED DATE RANGE.
  AND _TABLE_SUFFIX BETWEEN '20181218' AND '20181221'

I get 1326 active users which are considerably off from my Firebase Dashboard.

But when I include event_date I seem to get a more accurate count as show below:

SELECT
SUM(active_user_count)
FROM(

SELECT
 COUNT(DISTINCT user_pseudo_id ) as active_user_count,
 event_date
FROM
  -- PLEASE REPLACE WITH YOUR TABLE NAME.
  `xxxxxxx-xxxxxx.analytics_xxxx.events_*`
WHERE
  event_name = 'user_engagement'
  -- Pick events in the last N = 20 days.
  AND event_timestamp >
      UNIX_MICROS(TIMESTAMP_SUB(CURRENT_TIMESTAMP, INTERVAL 200 DAY))
  -- PLEASE REPLACE WITH YOUR DESIRED DATE RANGE.
  AND _TABLE_SUFFIX BETWEEN '20181218' AND '20181221'
  GROUP BY event_date)

My user count is 1497. I do not understand why if I group by date and sum I get a more accurate number.

Should the first query get me the same number as the 2nd query?

Thanks,

1

1 Answers

1
votes

I do not understand why if I group by date and sum I get a more accurate number.
Should the first query get me the same number as the 2nd query?

The short answer is - No, it should not!!

What first query does is - it calculates number of distinct users/accounts within whole period. So if same user/account was active in more than one day - this user/account will be counted only once!

Second query works quite differently - it first calculates active user for each and every day - so same user (in above example) will be counted separately for each day - so then when finally all these daily counts are SUM'ed - that user being counted as many times as inhow many days that user was active - thus the difference

Hope this helps you for further troubleshooting

Meantime, i would speculate on definition of active users in Firebase Dashboard - as it seems like they use follow logic of your second query (vs the first one) - but this one is my wild guess