To answer this question, I will first revisit Google's applicable definitions and then walk through the calculations (last revisited: July 2021).
Google gives us the following definitions:
GA4 - Automatically collected events
session_start
(app, web) - when a user engages the app or website
user_engagement
(app, web) - periodically, while the app is in the foreground or the webpage is in focus
With params: engagement_time_msec
GA4 - How the number of sessions is calculated
Sessions
: The number of sessions that began on your site or app (the session_start event was triggered).
App session timeout duration
: An app session begins to time out when an app is moved to the background, but you have the option to extend that session by including an extend_session parameter (with a value of 1) with events you send while the app is in the background. This is useful if your app is frequently used in the background (e.g. as with navigation and music apps). Change the default timeout of 30 minutes for app sessions via the setSessionTimeoutDuration method.
Engaged sessions
: The number of sessions that lasted 10 seconds or longer, or had 1 or more conversion events or 2 or more page views.
GA4 Dashboard
Monthly (28-day), Weekly (7-day), and Daily (1-day) Active Users
- Active users for the date range, including fluctuation by percentage from the previous date range. An active user has engaged with an app in the device foreground, and has logged a user_engagement event.
Daily user engagement
- Average daily engagement per user for the date range, including the fluctuation by percentage from the previous date range.
My take on the definitions:
Based on the supporting GA4/Firebase documents, I (re-)summarized the definitions for each of the metrics below. It is very important to note that only unique users should be counted for each metric (within the selected date range). There is no need to UNNEST, because we are already querying at the event_name level and not, for example, at the event_params level.
- 1-day active users: A 1-day unique active user has engaged with an app in the device foreground AND has logged a user_engagement event within the last 1-day period (given selected date range).
- 7-day active users: A 7-day unique active user has engaged with an app in the device foreground AND has logged a user_engagement event within the last 7-day period (given selected date range).
- 28-day active users: A 28-day unique active user has engaged with an app in the device foreground AND has logged a user_engagement event within the last 28-day period (given selected date range).
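To make the UNNEST remark concrete: the user counts only filter on event_name, which is a top-level column, so nothing has to be unnested. It only becomes necessary once you read a value stored inside event_params, such as engagement_time_msec. The following is a rough sketch, assuming the standard GA4/Firebase BigQuery export schema (integer parameters stored in value.int_value) and the same placeholder table and December 2018 range used in the queries in this answer:

```sql
-- Sketch: total engagement time per user. Unlike the active-user counts,
-- this needs UNNEST because engagement_time_msec lives inside event_params.
SELECT
  user_pseudo_id,
  SUM(ep.value.int_value) / 1000 AS engagement_seconds
FROM `<id>.events_*`, UNNEST(event_params) AS ep
WHERE
  event_name = 'user_engagement'
  AND ep.key = 'engagement_time_msec'
  AND _TABLE_SUFFIX BETWEEN '20181201' AND '20181231'
  AND platform = 'ANDROID'
GROUP BY user_pseudo_id
ORDER BY engagement_seconds DESC
```

For the active-user metrics themselves, dropping the UNNEST keeps one row per event, so COUNT(DISTINCT user_pseudo_id) is sufficient.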
The queries below show how the metrics are calculated for December 2018:
Methodology to Calculate Each Metric / Audience:
- Calculate DAUs for a specific month by using: Average 1-day active user metric.
- Calculate WAUs for a specific month by using: Average 7-day active user metric. I calculated this by averaging four consecutive 7-day snapshots (in the query below, the windows ending on 10, 17, 24, and 31 December).
- Calculate MAUs for a specific month by using: Non-averaged 28-day active user metric. The main reason for not averaging this metric is that I want only one snapshot of the entire month. Had I averaged several 28-day windows, I would also have counted users who were active in a previous month.
1.a) AVG 1-day Unique Active User Metric
# StandardSQL
SELECT
ROUND(AVG(users),0) AS users
FROM
(
SELECT
event_date,
COUNT(DISTINCT user_pseudo_id) AS users
FROM `<id>.events_*`
WHERE
event_name = 'user_engagement'
AND _TABLE_SUFFIX BETWEEN '20181201' AND '20181231'
AND platform = "ANDROID"
GROUP BY 1
) AS daily_users
# Alternatively, you could use the query below, but you will have to add one block per remaining day to cover the entire month.
-- Set your variables here
WITH timeframe AS (SELECT DATE("2018-12-01") AS start_date, DATE("2018-12-31") AS end_date)
-- Query your variables here
SELECT ROUND(AVG(users),0) AS users
FROM
(
SELECT event_date, COUNT(DISTINCT user_pseudo_id) AS users
FROM `<id>.events_*` AS z, timeframe AS t
WHERE
event_name = 'user_engagement'
AND _TABLE_SUFFIX > FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL - 1 DAY))
AND _TABLE_SUFFIX <= FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL 0 DAY))
AND platform = "ANDROID"
GROUP BY 1
UNION ALL
SELECT event_date, COUNT(DISTINCT user_pseudo_id) AS users
FROM `<id>.events_*` AS z, timeframe AS t
WHERE
event_name = 'user_engagement'
AND _TABLE_SUFFIX > FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL - 2 DAY))
AND _TABLE_SUFFIX <= FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL - 1 DAY))
AND platform = "ANDROID"
GROUP BY 1
...
...
...
...
) avg_1_day_active_users
1.b) AVG 1-day Unique Active User Metric
A more recent version, scheduled daily to a BQ destination table daus_android_{run_time|"%Y%m%d"} with write preference WRITE_APPEND, could look like the query below. In an earlier deep-dive I determined that it can take up to 48 hours for intraday-table events to propagate to the permanent BQ tables (hence the - 3 days in the query).
with base AS (
SELECT *
FROM `<id>.analytics_<number>.events_*`
WHERE (_TABLE_SUFFIX >= FORMAT_DATE('%Y%m%d', DATE_ADD(@run_date, INTERVAL - 3 DAY)) AND _TABLE_SUFFIX < FORMAT_DATE('%Y%m%d', DATE_ADD(@run_date, INTERVAL - 2 DAY)))
AND platform = "ANDROID"
AND event_name = 'user_engagement'
), app AS (
SELECT
FORMAT_DATE('%Y%m%d', @run_date) AS _currentdate,
FORMAT_DATE('%Y%m%d', DATE_ADD(@run_date, INTERVAL - 3 DAY)) AS _begindate,
FORMAT_DATE('%Y%m%d', DATE_ADD(@run_date, INTERVAL - 2 DAY)) AS _enddate,
TIMESTAMP_DIFF(TIMESTAMP(DATE_ADD(@run_date, INTERVAL - 2 DAY)), TIMESTAMP(DATE_ADD(@run_date, INTERVAL - 3 DAY)), HOUR) AS _hoursdiff,
COUNT(DISTINCT user_pseudo_id) AS _uniqusers
FROM base
)
SELECT
app._currentdate,
app._begindate,
app._enddate,
app._hoursdiff,
app._uniqusers
FROM app;
1.c) AVG 1-day Unique Active User Metric
WITH app as (
SELECT
FORMAT_DATE('%Y%m%d', @run_date) AS _currentdate,
FORMAT_DATE('%Y%m%d', DATE_ADD(@run_date, INTERVAL - 3 DAY)) AS _begindate,
FORMAT_DATE('%Y%m%d', DATE_ADD(@run_date, INTERVAL - 2 DAY)) AS _enddate,
TIMESTAMP_DIFF(TIMESTAMP(DATE_ADD(@run_date, INTERVAL - 2 DAY)), TIMESTAMP(DATE_ADD(@run_date, INTERVAL - 3 DAY)), HOUR) AS _hoursdiff,
COUNT(DISTINCT user_pseudo_id) AS _uniqusers
FROM `<gcp-project>.analytics_<id>.events_*`
WHERE
platform = "ANDROID"
AND event_name = 'user_engagement'
AND _TABLE_SUFFIX >= FORMAT_DATE('%Y%m%d', DATE_ADD(@run_date, INTERVAL - 3 DAY))
AND _TABLE_SUFFIX < FORMAT_DATE('%Y%m%d', DATE_ADD(@run_date, INTERVAL - 2 DAY))
)
SELECT
app._currentdate,
app._begindate,
app._enddate,
app._hoursdiff,
app._uniqusers
FROM app
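Once the scheduled query has run for a while, the stored daily rows can be rolled up into a monthly average. This is only a sketch under assumptions: `<reporting_dataset>` is a hypothetical placeholder for wherever the destination tables land, and the tables are assumed to be date-suffixed as daus_android_YYYYMMDD with the columns produced by the query above.

```sql
-- Sketch: average DAU per month from the scheduled query's output tables.
-- _begindate is the day that was measured, stored as a 'YYYYMMDD' string.
SELECT
  SUBSTR(_begindate, 1, 6) AS year_month,
  ROUND(AVG(_uniqusers), 0) AS avg_dau,
  COUNT(*) AS days_measured
FROM `<id>.<reporting_dataset>.daus_android_*`
GROUP BY year_month
ORDER BY year_month
```

Because the write preference is WRITE_APPEND, a re-run for the same day would add a second row for that day, so it may be worth checking days_measured against the number of days in the month.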
2.a) AVG 7-day Unique Active User Metric
-- Set your variables here
WITH timeframe AS (SELECT DATE("2018-12-01") AS start_date, DATE("2018-12-31") AS end_date)
-- Query your variables here
SELECT ROUND(AVG(users),0) AS users
FROM
(
SELECT COUNT(DISTINCT user_pseudo_id) AS users
FROM `<id>.events_*` AS z, timeframe AS t
WHERE
event_name = 'user_engagement'
AND _TABLE_SUFFIX > FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL - 7 DAY))
AND _TABLE_SUFFIX <= FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL 0 DAY))
AND platform = "ANDROID"
UNION ALL
SELECT COUNT(DISTINCT user_pseudo_id) AS users
FROM `<id>.events_*` AS z, timeframe AS t
WHERE
event_name = 'user_engagement'
AND _TABLE_SUFFIX > FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL - 14 DAY))
AND _TABLE_SUFFIX <= FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL - 7 DAY))
AND platform = "ANDROID"
UNION ALL
SELECT COUNT(DISTINCT user_pseudo_id) AS users
FROM `<id>.events_*` AS z, timeframe AS t
WHERE
event_name = 'user_engagement'
AND _TABLE_SUFFIX > FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL - 21 DAY))
AND _TABLE_SUFFIX <= FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL - 14 DAY))
AND platform = "ANDROID"
UNION ALL
SELECT COUNT(DISTINCT user_pseudo_id) AS users
FROM `<id>.events_*` AS z, timeframe AS t
WHERE
event_name = 'user_engagement'
AND _TABLE_SUFFIX > FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL - 28 DAY))
AND _TABLE_SUFFIX <= FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL - 21 DAY))
AND platform = "ANDROID"
) avg_7_day_active_users
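The UNION ALL pattern above needs one hand-written branch per snapshot. As an alternative, the snapshot dates can be generated and joined against the activity data. The following is a sketch rather than a drop-in replacement: the CTE and column names (params, snapshot_dates, active_days, window_days, step_days, snapshots) are my own, and it assumes event_date holds the same YYYYMMDD value as the table suffix.

```sql
-- Sketch: average N-day active users over several snapshot dates,
-- without writing one UNION ALL branch per snapshot.
WITH params AS (
  SELECT
    DATE '2018-12-31' AS end_date,  -- most recent snapshot date
    7 AS window_days,               -- length of each window (7 for WAU)
    7 AS step_days,                 -- distance between snapshot dates
    4 AS snapshots                  -- number of windows to average
),
snapshot_dates AS (
  -- end_date, end_date - step_days, end_date - 2 * step_days, ...
  SELECT DATE_SUB(p.end_date, INTERVAL i * p.step_days DAY) AS snapshot_date
  FROM params AS p, UNNEST(GENERATE_ARRAY(0, p.snapshots - 1)) AS i
),
active_days AS (
  -- one row per user per active day keeps the range join small
  SELECT DISTINCT
    PARSE_DATE('%Y%m%d', event_date) AS activity_date,
    user_pseudo_id
  FROM `<id>.events_*`
  WHERE
    event_name = 'user_engagement'
    AND platform = 'ANDROID'
    -- must cover the earliest window: here 4 x 7 days ending 2018-12-31
    AND _TABLE_SUFFIX BETWEEN '20181201' AND '20181231'
),
per_snapshot AS (
  SELECT
    s.snapshot_date,
    COUNT(DISTINCT a.user_pseudo_id) AS users
  FROM snapshot_dates AS s
  CROSS JOIN params AS p
  JOIN active_days AS a
    ON a.activity_date > DATE_SUB(s.snapshot_date, INTERVAL p.window_days DAY)
   AND a.activity_date <= s.snapshot_date
  GROUP BY s.snapshot_date
)
SELECT ROUND(AVG(users), 0) AS users
FROM per_snapshot
```

With these parameter values the windows are the same four as in 2.a. Setting window_days to 28 and step_days to 1 would instead average the 28-day windows ending on 28, 29, 30, and 31 December, which all still fit inside the December suffix range.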
2.b) AVG 7-day Unique Active User Metric
A more recent version, scheduled daily to a BQ destination table waus_android_{run_time|"%Y%m%d"} with write preference WRITE_APPEND, could look like:
with base AS (
SELECT *
FROM `<id>.analytics_<number>.events_*`
WHERE (_TABLE_SUFFIX >= FORMAT_DATE('%Y%m%d', DATE_ADD(@run_date, INTERVAL - 9 DAY)) AND _TABLE_SUFFIX < FORMAT_DATE('%Y%m%d', DATE_ADD(@run_date, INTERVAL - 2 DAY)))
AND platform = "ANDROID"
AND event_name = 'user_engagement'
), app AS (
SELECT
FORMAT_DATE('%Y%m%d', @run_date) AS _currentdate,
FORMAT_DATE('%Y%m%d', DATE_ADD(@run_date, INTERVAL - 9 DAY)) AS _begindate,
FORMAT_DATE('%Y%m%d', DATE_ADD(@run_date, INTERVAL - 2 DAY)) AS _enddate,
TIMESTAMP_DIFF(TIMESTAMP(DATE_ADD(@run_date, INTERVAL - 2 DAY)), TIMESTAMP(DATE_ADD(@run_date, INTERVAL - 9 DAY)), HOUR) AS _hoursdiff,
COUNT(DISTINCT user_pseudo_id) AS _uniqusers
FROM base
)
SELECT
app._currentdate,
app._begindate,
app._enddate,
app._hoursdiff,
app._uniqusers
FROM app;
3.a) Non-averaged 28-day Unique Active User Metric
# StandardSQL
-- Set your variables here
WITH timeframe AS (SELECT DATE("2018-12-01") AS start_date, DATE("2018-12-31") AS end_date)
-- Query your variables here
SELECT COUNT(DISTINCT user_pseudo_id) AS users
FROM `<id>.events_*` AS z, timeframe AS t
WHERE
event_name = 'user_engagement'
AND _TABLE_SUFFIX > FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL - 28 DAY))
AND _TABLE_SUFFIX <= FORMAT_DATE('%Y%m%d', DATE_ADD(t.end_date, INTERVAL 0 DAY))
AND platform = "ANDROID"
3.b) Non-averaged 28-day Unique Active User Metric
A more recent version, scheduled daily to a BQ destination table maus_android_{run_time|"%Y%m%d"} with write preference WRITE_APPEND, could look like:
with base AS (
SELECT *
FROM `<id>.analytics_<number>.events_*`
WHERE (_TABLE_SUFFIX >= FORMAT_DATE('%Y%m%d', DATE_ADD(@run_date, INTERVAL - 30 DAY)) AND _TABLE_SUFFIX < FORMAT_DATE('%Y%m%d', DATE_ADD(@run_date, INTERVAL - 2 DAY)))
AND platform = "ANDROID"
AND event_name = 'user_engagement'
), app AS (
SELECT
FORMAT_DATE('%Y%m%d', @run_date) AS _currentdate,
FORMAT_DATE('%Y%m%d', DATE_ADD(@run_date, INTERVAL - 30 DAY)) AS _begindate,
FORMAT_DATE('%Y%m%d', DATE_ADD(@run_date, INTERVAL - 2 DAY)) AS _enddate,
TIMESTAMP_DIFF(TIMESTAMP(DATE_ADD(@run_date, INTERVAL - 2 DAY)), TIMESTAMP(DATE_ADD(@run_date, INTERVAL - 30 DAY)), HOUR) AS _hoursdiff,
COUNT(DISTINCT user_pseudo_id) AS _uniqusers
FROM base
)
SELECT
app._currentdate,
app._begindate,
app._enddate,
app._hoursdiff,
app._uniqusers
FROM app;
Side Notes:
- I know some companies still calculate their MAUs over a 30-day period, so you will have to test and see what works best for your company.
- You can calculate your own DAU-to-MAU ratio or WAU-to-MAU ratio from the examples above to determine your app's stickiness.
- The only problem I have with the MAU calculation is that it does not yet take into account the starting days of each month. Perhaps one could take the average of Day31 - 28days, Day30 - 28days, Day29 - 28days, Day28 - 28days ...
- I also found the Firebase Team's sample queries helpful, but their active-user metrics only address the active user count at the time the query is executed (see the example below):
SELECT
COUNT(DISTINCT user_id)
FROM
/* PLEASE REPLACE WITH YOUR TABLE NAME */
`YOUR_TABLE.events_*`
WHERE
event_name = 'user_engagement'
/* Pick events in the last N = 20 days */
AND event_timestamp > UNIX_MICROS(TIMESTAMP_SUB(CURRENT_TIMESTAMP, INTERVAL 20 DAY))
/* PLEASE REPLACE WITH YOUR DESIRED DATE RANGE */
AND _TABLE_SUFFIX BETWEEN '20180521' AND '20240131';
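Coming back to the stickiness ratio mentioned in the side notes, one way to get the DAU-to-MAU ratio in a single query is sketched below. It assumes the same placeholder table as the queries above and the 28-day window ending on 31 December 2018; the CTE and column names (active_days, dau, mau, users_28d, dau_to_mau) are my own.

```sql
-- Sketch: DAU-to-MAU stickiness for one 28-day window.
WITH active_days AS (
  -- one row per user per active day
  SELECT DISTINCT event_date, user_pseudo_id
  FROM `<id>.events_*`
  WHERE
    event_name = 'user_engagement'
    AND platform = 'ANDROID'
    -- 28 days ending 2018-12-31
    AND _TABLE_SUFFIX BETWEEN '20181204' AND '20181231'
),
dau AS (
  SELECT AVG(users) AS avg_dau
  FROM (
    -- rows are already unique per user and day, so COUNT(*) is enough
    SELECT event_date, COUNT(*) AS users
    FROM active_days
    GROUP BY event_date
  )
),
mau AS (
  SELECT COUNT(DISTINCT user_pseudo_id) AS users_28d
  FROM active_days
)
SELECT
  ROUND(dau.avg_dau, 0) AS avg_dau,
  mau.users_28d,
  ROUND(SAFE_DIVIDE(dau.avg_dau, mau.users_28d), 3) AS dau_to_mau
FROM dau CROSS JOIN mau
```

Note that a day without any user_engagement event produces no row and is therefore left out of the DAU average. For a WAU-to-MAU ratio, the daily grouping would be replaced by the 7-day windows from 2.a.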
🥷 Ninja Tip 🥷
To transition your team's/company's/product's focus from Vanity Metrics to Actionable Metrics, consider adding one of your main conversion events to the queries above (e.g. in_app_purchase for e-commerce companies):
with base AS (
SELECT *
FROM `<id>.analytics_<number>.events_*`
WHERE (_TABLE_SUFFIX >= FORMAT_DATE('%Y%m%d', DATE_ADD(@run_date, INTERVAL - 3 DAY)) AND _TABLE_SUFFIX < FORMAT_DATE('%Y%m%d', DATE_ADD(@run_date, INTERVAL - 2 DAY)))
AND platform = "ANDROID"
# AND event_name = 'user_engagement'
AND event_name = 'in_app_purchase'
), app AS (
SELECT
FORMAT_DATE('%Y%m%d', @run_date) AS _currentdate,
FORMAT_DATE('%Y%m%d', DATE_ADD(@run_date, INTERVAL - 3 DAY)) AS _begindate,
FORMAT_DATE('%Y%m%d', DATE_ADD(@run_date, INTERVAL - 2 DAY)) AS _enddate,
TIMESTAMP_DIFF(TIMESTAMP(DATE_ADD(@run_date, INTERVAL - 2 DAY)), TIMESTAMP(DATE_ADD(@run_date, INTERVAL - 3 DAY)), HOUR) AS _hoursdiff,
COUNT(DISTINCT user_pseudo_id) AS _uniqusers
FROM base
)
SELECT
app._currentdate,
app._begindate,
app._enddate,
app._hoursdiff,
app._uniqusers
FROM app;
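Instead of swapping the event name, both counts can also be placed side by side, which gives the share of active users who converted on each day. This is a sketch under the same assumptions as the queries above (placeholder table, Android only, December 2018); the output column names are my own.

```sql
-- Sketch: daily active users next to daily purchasing users.
SELECT
  event_date,
  COUNT(DISTINCT IF(event_name = 'user_engagement', user_pseudo_id, NULL)) AS active_users,
  COUNT(DISTINCT IF(event_name = 'in_app_purchase', user_pseudo_id, NULL)) AS purchasing_users,
  ROUND(SAFE_DIVIDE(
    COUNT(DISTINCT IF(event_name = 'in_app_purchase', user_pseudo_id, NULL)),
    COUNT(DISTINCT IF(event_name = 'user_engagement', user_pseudo_id, NULL))
  ), 4) AS purchase_share
FROM `<id>.analytics_<number>.events_*`
WHERE
  _TABLE_SUFFIX BETWEEN '20181201' AND '20181231'
  AND platform = 'ANDROID'
  AND event_name IN ('user_engagement', 'in_app_purchase')
GROUP BY event_date
ORDER BY event_date
```

A purchasing user is not guaranteed to have a user_engagement event on the same day, so purchase_share is an approximation rather than a strict subset ratio.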