I am frequently running into problems where Snowflake query optimization seems to break UDFs as soon as I try to invoke them inside more complex queries. To try and find commonality in my missteps, I've distilled to a simple test javascript UDTF (called generate_series) that takes 3 doubles as arguments, and returns a table with a single column also of type double. This is able to run okay:
WITH report_params AS (
SELECT
1::double as first_value,
3::double as last_value,
1::double AS step_value
)
SELECT
*
FROM
report_params,
LATERAL generate_series(
first_value,
last_value,
step_value
)
(Side discovery if I use the LATERAL keyword on the cross join I don't need to encapsulate the UDTF in a table() cast. Not sure if there is rhyme or reason to this?)
Now, if I try to invoke the UDTF using sub-queries as parameters, which seems to be in compliance with the described snowflake subquery limitations, like so:
WITH report_params AS (
SELECT
1::double as first_value,
3::double as last_value,
1::double AS step_value
)
SELECT
*
FROM
table(
generate_series(
(SELECT MAX(first_value) FROM report_params),
(SELECT MAX(last_value) FROM report_params),
(SELECT MAX(step_value) FROM report_params)
)
)
I get a compilation error:
SQL compilation error: syntax error line 12 at position 49 unexpected ','. syntax error line 15 at position 6 unexpected ')'.
That is problem #1 I have, that fails regardless, though it should be valid SQL. My blind guess is the compiler misinterprets this as a correlated subquery.
Problem #2 is I then have a tiny wrapper UDTF that overloads the arguments and return values as integers (since JS-UDFs can only accept doubles)
CREATE OR REPLACE FUNCTION generate_series(FIRST_VALUE INTEGER, LAST_VALUE INTEGER, STEP_VALUE INTEGER)
RETURNS TABLE (GS_VALUE INTEGER)
AS
$$
SELECT GS_VALUE::INTEGER AS GS_VALUE FROM table(generate_series(FIRST_VALUE::DOUBLE,LAST_VALUE::DOUBLE,STEP_VALUE::DOUBLE))
$$;
Calling this overloaded function directly with constants e.g.
SELECT * FROM table(generate_series(1::integer,3::integer,1::integer))
this wrapper UDTF still works okay.
However, when I modify the previous working query (first variation) by just changing the param value types to integer to invoke the wrapper UDTF, e.g.
WITH report_params AS (
SELECT
1::integer as first_value,
3::integer as last_value,
1::integer AS step_value
)
SELECT
*
FROM
report_params, table(
generate_series(
first_value,
last_value,
step_value
)
)
I get the error
SQL compilation error: Unsupported subquery type cannot be evaluated
This error is the same in either the form directly above, or removing the table CAST and using the LATERAL keyword as presented in the first working form.
I have read in past threads that people have removed the SELECT...FROM portion of SQL UDTFs with mixed success to solve the problem, but none of the iterations I tried seemed to work.
It seems a gentle breeze can send the Snowflake SQL compiler on tilt, not sure if there is a debugging mode where I can observe the "deconstructed" form of the query that it's actually trying to run, and see where the problem is.
EDIT: this is the underlying Javascript UDTF referenced in these examples:
CREATE OR REPLACE FUNCTION generate_series(FIRST_VALUE DOUBLE, LAST_VALUE DOUBLE, STEP_VALUE DOUBLE)
RETURNS TABLE (GS_VALUE DOUBLE)
LANGUAGE JAVASCRIPT
AS '{
processRow: function get_params(row, rowWriter, context){
if(row.STEP_VALUE === 0) {
throw new Error("Step cannot be 0");
}
else if(row.STEP_VALUE > 0 && row.FIRST_VALUE > row.LAST_VALUE) {
throw new Error("Positive series is open-ended.");
}
else if(row.STEP_VALUE < 0 && row.FIRST_VALUE < row.LAST_VALUE) {
throw new Error("Negative series is open-ended.");
}
else {
let i = row.FIRST_VALUE;
while(row.STEP_VALUE > 0 ? i <= row.LAST_VALUE : i >= row.LAST_VALUE) {
rowWriter.writeRow({GS_VALUE: i});
i+=row.STEP_VALUE;
}
}}
}';