I am just starting to learn SQL and face the following issue:
I have a web-site with registered customers who visit it monthly. There is no subscription of any kind so I just have a table that collects events of my customers' activity (userId, action, timestamp).
I can easily check how many unique (DISTINCT userId) I have in January vs February, but what I want to learn is - how many users that were active in January (as seen by their actions in the table) are still using the site in February? so its like a churn but I can't wrap my head around whats the best approach here.
What I am thinking of as possible solution:
- Get list of userIds from January:
SELECT DISTINCT user_id
FROM site_activity
WHERE time_utc BETWEEN '2022/01/01' AND '2022/01/31'
- Get list of userIds from February
SELECT DISTINCT user_id
FROM site_activity
WHERE time_utc BETWEEN '2022/02/01' AND '2022/02/28'
- Somehow check for overlap and mark each February user that exists in January table as "retained" and mark anyone else as "new" or "resuming" (maybe they were active in Dec last year).
And here I thought of writing something with WHILE?
Am I on the right track here?