Let me preface by saying that I know this is not a particularly efficient or elegant piece of code. I am querying from a temp table called INSIDE made like so:
CREATE TEMP TABLE INSIDE (CONNECT_DATE DATE, DAILY_CONNECTIONS INT);`
I'm then attempting to run the following query on INSIDE in an attempt to test a model I've been working on.
SELECT *
, q5.DAN_PREDICTION - q5.LINEAR_PREDICTION AS PREDICTION_COMPARISON
, q5.DAN_PREDICTION - q5.ACTUAL_MONTH_END_AMOUNT AS DAN_VARIANCE
, q5.LINEAR_PREDICTION - q5.ACTUAL_MONTH_END_AMOUNT AS LINEAR_VARIANCE
FROM (SELECT *
, q4.mtd + q4.last_yr_remainder + q4.run_rate * q4.days_remaining AS DAN_PREDICTION
, q4.mtd + q4.curr_yr_7_day * days_remaining AS LINEAR_PREDICTION
FROM(
SELECT
*
, q3.curr_yr_7_day - q3.last_yr_7_day AS RUN_RATE
FROM(
SELECT
CONNECT_DATE
, DAILY_CONNECTIONS
, (cur_yr_1_prev + cur_yr_2_prev + cur_yr_3_prev + cur_yr_4_prev + cur_yr_5_prev + cur_yr_6_prev + cur_yr_7_prev)/7 AS CURR_YR_7_DAY
, (last_yr_1_prev + last_yr_2_prev + last_yr_3_prev + last_yr_4_prev + last_yr_5_prev + last_yr_6_prev + last_yr_7_prev)/7 AS LAST_YR_7_DAY
, (SELECT ISNULL(SUM(ins.DAILY_CONNECTIONS), 0)
FROM INSIDE ins
WHERE DATEPART(MONTH, ins.CONNECT_DATE) = DATEPART(MONTH, q2.CONNECT_DATE)
AND DATEPART(YEAR, ins.CONNECT_DATE) = DATEPART(YEAR, q2.CONNECT_DATE)
AND ins.CONNECT_DATE <= q2.CONNECT_DATE) AS MTD
, (SELECT ISNULL(SUM(ins.DAILY_CONNECTIONS), 0)
FROM INSIDE ins
WHERE DATEPART(MONTH, ins.CONNECT_DATE) = DATEPART(MONTH, q2.CONNECT_DATE)
AND DATEPART(YEAR, ins.CONNECT_DATE) = DATEPART(YEAR, q2.CONNECT_DATE)-1
AND ins.CONNECT_DATE > DATEADD(YEAR, -1, q2.CONNECT_DATE)) AS LAST_YR_REMAINDER
, (SELECT TOP 1 DATEPART(DAY, last_day(CONNECT_DATE))
FROM INSIDE
WHERE CONNECT_DATE = q2.CONNECT_DATE)-DATEPART(DAY, q2.CONNECT_DATE) DAYS_REMAINING
, (SELECT ISNULL(SUM(ins.DAILY_CONNECTIONS), 0)
FROM INSIDE ins
WHERE DATEPART(MONTH, ins.CONNECT_DATE) = DATEPART(MONTH, q2.CONNECT_DATE)
AND DATEPART(YEAR, ins.CONNECT_DATE) = DATEPART(YEAR, q2.CONNECT_DATE)) AS ACTUAL_MONTH_END_AMOUNT
FROM
(SELECT
q1.CONNECT_DATE CONNECT_DATE
, q1.DAILY_CONNECTIONS DAILY_CONNECTIONS
, ISNULL((SELECT DAILY_CONNECTIONS FROM INSIDE WHERE CONNECT_DATE = DATEADD(DAY,-1,q1.connect_date)), 0) CUR_YR_1_PREV
, ISNULL((SELECT DAILY_CONNECTIONS FROM INSIDE WHERE CONNECT_DATE = DATEADD(DAY,-2,q1.connect_date)), 0) CUR_YR_2_PREV
, ISNULL((SELECT DAILY_CONNECTIONS FROM INSIDE WHERE CONNECT_DATE = DATEADD(DAY,-3,q1.connect_date)), 0) CUR_YR_3_PREV
, ISNULL((SELECT DAILY_CONNECTIONS FROM INSIDE WHERE CONNECT_DATE = DATEADD(DAY,-4,q1.connect_date)), 0) CUR_YR_4_PREV
, ISNULL((SELECT DAILY_CONNECTIONS FROM INSIDE WHERE CONNECT_DATE = DATEADD(DAY,-5,q1.connect_date)), 0) CUR_YR_5_PREV
, ISNULL((SELECT DAILY_CONNECTIONS FROM INSIDE WHERE CONNECT_DATE = DATEADD(DAY,-6,q1.connect_date)), 0) CUR_YR_6_PREV
, ISNULL((SELECT DAILY_CONNECTIONS FROM INSIDE WHERE CONNECT_DATE = DATEADD(DAY,-7,q1.connect_date)), 0) CUR_YR_7_PREV
, ISNULL((SELECT DAILY_CONNECTIONS FROM INSIDE WHERE CONNECT_DATE = DATEADD(YEAR, -1,(DATEADD(DAY,-1,q1.connect_date)))), 0) LAST_YR_1_PREV
, ISNULL((SELECT DAILY_CONNECTIONS FROM INSIDE WHERE CONNECT_DATE = DATEADD(YEAR, -1,(DATEADD(DAY,-2,q1.connect_date)))), 0) LAST_YR_2_PREV
, ISNULL((SELECT DAILY_CONNECTIONS FROM INSIDE WHERE CONNECT_DATE = DATEADD(YEAR, -1,(DATEADD(DAY,-3,q1.connect_date)))), 0) LAST_YR_3_PREV
, ISNULL((SELECT DAILY_CONNECTIONS FROM INSIDE WHERE CONNECT_DATE = DATEADD(YEAR, -1,(DATEADD(DAY,-4,q1.connect_date)))), 0) LAST_YR_4_PREV
, ISNULL((SELECT DAILY_CONNECTIONS FROM INSIDE WHERE CONNECT_DATE = DATEADD(YEAR, -1,(DATEADD(DAY,-5,q1.connect_date)))), 0) LAST_YR_5_PREV
, ISNULL((SELECT DAILY_CONNECTIONS FROM INSIDE WHERE CONNECT_DATE = DATEADD(YEAR, -1,(DATEADD(DAY,-6,q1.connect_date)))), 0) LAST_YR_6_PREV
, ISNULL((SELECT DAILY_CONNECTIONS FROM INSIDE WHERE CONNECT_DATE = DATEADD(YEAR, -1,(DATEADD(DAY,-7,q1.connect_date)))), 0) LAST_YR_7_PREV
FROM INSIDE q1 ORDER BY q1.CONNECT_DATE
) q2 ORDER BY q2.connect_date
) q3
) q4
) q5
Running the inner q1 query seems to work just fine; the problem starts when I run the subqueries in q2. Running any more than one of them (MTD, LAST_YR_REMAINDER, etc) at a time produces the following error:
Amazon Invalid operation: This type of correlated subquery pattern is not supported due to internal error;
I've been reviewing the documentation for unsupported subquery types in Redshift and don't understand which rule these are breaking. Any help would be greatly appreciated.
connect_date daily_connections: 2016-05-20 867I calculate a year over year run-rate by finding the difference between the average number of connections over the past seven days from a date and the seven day average of the past year. I then add the number of connections for the month on the date in question to the remainder of the connections for the same month last year, plus the run rate multiplied by the number of days left in the month from the connect date. The final step (q5) is just comparing that result to a few things. - Grant Hofmann