MY SITUATION:
I have written a piece of code that returns a dataset containing a web user's aggregated activity for the previous 90 days and returns a score, subsequent to some calculation. Essentially, like RFV.
A (VERY) simplified version of the code can be seen below:
WITH start_data AS (
SELECT user_id
,COUNT(web_visits) AS count_web_visits
,COUNT(button_clicks) AS count_button_clicks
,COUNT(login) AS count_log_in
,SUM(time_on_site) AS total_time_on_site
,CURRENT_DATE AS run_date
FROM web.table
WHERE TO_CHAR(visit_date, 'YYYY-MM-DD') BETWEEN DATEADD(DAY, -90, CURRENT_DATE) AND CURRENT_DATE
AND some_flag = 1
AND some_other_flag = 2
GROUP BY user_id
ORDER BY user_id DESC
)
The output might look something like the below:
| user_id | count_web_visits | count_button_clicks | count_log_in | total_time_on_site | run_date |
|---------|------------------|---------------------|--------------|--------------------|----------|
| 1234567 | 256 | 932 |16 | 1200 | 23-01-20 |
| 2391823 | 710 | 1345 |308 | 6000 | 23-01-20 |
| 3729128 | 67 | 204 |83 | 320 | 23-01-20 |
| 5561296 | 437 | 339 |172 | 3600 | 23-01-20 |
This output is then stored in it's own AWS/Redhsift table and will form base table for the task.
SELECT *
into myschema.base_table
FROM start_data
DESIRED OUTPUT:
What I need to be able to do, is iteratively run this code such that I append new data to myschema.base_table
, every day, for the previous 90's day aggregation.
The way I see it, I can either go forwards or backwards, it doesn't matter.
That is to say, I can either:
- Starting from today, run the code, everyday, for the preceding 90 days, going BACK to the (first date in the table + 90 days)
OR
- Starting from the (first date in the table + 90 days), run the code for the preceding 90 days, everyday, going FORWARD to today.
Option 2 seems the best option to me and the desired output looks like this (PARTITION FOR ILLUSTRATION ONLY):
| user_id | count_web_visits | count_button_clicks | count_log_in | total_time_on_site | run_date |
|---------|------------------|---------------------|--------------|--------------------|----------|
| 1234567 | 412 | 339 |180 | 3600 | 20-01-20 |
| 2391823 | 417 | 6253 |863 | 2400 | 20-01-20 |
| 3729128 | 67 | 204 |83 | 320 | 20-01-20 |
| 5561296 | 281 | 679 |262 | 4200 | 20-01-20 |
|---------|------------------|---------------------|--------------|--------------------|----------|
| 1234567 | 331 | 204 |83 | 3200 | 21-01-20 |
| 2391823 | 652 | 1222 |409 | 7200 | 21-01-20 |
| 3729128 | 71 | 248 |71 | 720 | 21-01-20 |
| 5561296 | 366 | 722 |519 | 3600 | 21-01-20 |
|---------|------------------|---------------------|--------------|--------------------|----------|
| 1234567 | 213 | 808 |57 | 3600 | 22-01-20 |
| 2391823 | 817 | 4265 |476 | 1200 | 22-01-20 |
| 3729128 | 33 | 128 |62 | 120 | 22-01-20 |
| 5561296 | 623 | 411 |283 | 2400 | 22-01-20 |
|---------|------------------|---------------------|--------------|--------------------|----------|
| 1234567 | 256 | 932 |16 | 1200 | 23-01-20 |
| 2391823 | 710 | 1345 |308 | 6000 | 23-01-20 |
| 3729128 | 67 | 204 |83 | 320 | 23-01-20 |
| 5561296 | 437 | 339 |172 | 3600 | 23-01-20 |
WHAT I HAVE TRIED:
I have successfully created a WHILE
loop to sequentially increment the date as follows:
CREATE OR REPLACE PROCEDURE retrospective_data()
LANGUAGE plpgsql
AS $$
DECLARE
start_date DATE := '2020-11-20' ;
BEGIN
WHILE CURRENT_DATE > start_date
LOOP
RAISE INFO 'Date: %', start_date;
start_date = start_date + 1;
END LOOP;
RAISE INFO 'Loop Statment Executed Successfully';
END;
$$;
CALL retrospective_data();
Thus producing the dates as follows:
INFO: Date: 2020-11-20
INFO: Date: 2020-11-21
INFO: Date: 2020-11-22
INFO: Date: 2020-11-23
INFO: Date: 2020-11-24
INFO: Date: 2020-11-25
INFO: Date: 2020-11-26
INFO: Loop Statment Executed Successfully
Query 1 OK: CALL
WHAT I NEED HELP WITH:
I need to be able to apply the WHILE
loop to the initial code such that the WHERE
clause becomes:
WHERE TO_CHAR(visit_date, 'YYYY-MM-DD') BETWEEN DATEADD(DAY, -90, start_date) AND start_date
But where start_date is the result of each incremental loop. Additionally, the result of each execution needs to be appended to the previous.
Any help appreciated.