I have been running a query of the format below
SELECT b.date as Date,COUNT(DISTINCT user_id) AS NewUsers FROM (
SELECT user_id,MIN(date) as min_date
FROM tableA
WHERE date >= '2018-10-10'
AND filter1 = "XYZ"
GROUP BY ) a
CROSS JOIN (
SELECT date FROM tableB
WHERE date >= '2018-10-19' AND date <= CURRENT_DATE()
GROUP BY 1) b
WHERE a.date >= DATE_SUB(b.date, INTERVAL 6 DAY) AND a.date <= b.date
GROUP BY 1
Let's say the above is result1
SELECT b.date as Date,COUNT(DISTINCT user_id) AS NewUsers FROM (
SELECT user_id,MIN(date) as min_date
FROM tableA
WHERE date >= '2018-07-10'
AND filter1 = "XYZ"
GROUP BY ) a
CROSS JOIN (
SELECT date FROM tableB
WHERE date >= '2018-07-19' AND date <= CURRENT_DATE()
GROUP BY 1) b
WHERE a.date >= DATE_SUB(b.date, INTERVAL 6 DAY) AND a.date <= b.date
GROUP BY 1
The above is result2
Here 2018-07-19 is the launch date.
Since I have the data till 2018-10-19, I want to run the query from the later date to optimize the cost and the data consumption by the query....but some how, I am getting incorrect data.
But, if I run the same query from the launch date, I am getting the correct results.
I mean the NewUsers from result1 for the corresponding dates (like date >= 2018-10-19) are more than the NewUsers from result2.
No sure, where I am missing something.
Any help would be greatly appreciated.
Thanks