I'm trying to look at user activity by date. The first step is to build a table of every day since a user account was created, using cross join and a where clause. My first attempt was this:
SELECT
u.user_id as user_id,
date(u.created) as signup_date,
cal.date as date,
from rsdw.user u
cross join (select date(dt) as date from [rsdw.calendar] where date(dt) < CURRENT_DATE() ) cal
where
date(u.created) <= cal.date
(The calendar table is just a list of all dates since 2006 (3288 rows). The user table has ~1m rows.)
This query takes a forever... so long that I've abandoned it at 1000 seconds or so. I tried tweaking the query a little. If I add an "each" to the cross join:
SELECT
u.user_id as user_id,
date(u.created) as signup_date,
cal.date as date,
from rsdw.user u
cross join each (select date(dt) as date from [rsdw.calendar] where date(dt) < CURRENT_DATE() ) cal
where
date(u.created) <= cal.date
I get an errror:
Error: Cannot CROSS JOIN two tables with EACH qualifiers.
Finally, if I keep the "each" but swap the tables, it completes in just 90s!
SELECT
u.user_id as user_id,
date(u.created) as signup_date,
cal.date as date,
from (select date(dt) as date from [rsdw.calendar] where date(dt) < CURRENT_DATE() ) cal
cross join each rsdw.user u
where
date(u.created) <= cal.date
Can anyone explain why the third iteration is so much faster and why the second one results in an error?