0
votes

I have a snowflake Javascript stored procedure that takes in 2 arguments and performs runtime dynamic aggregation based on the columns selected from UI

Adding more context below

  • 1 argument for selected columns (selected filter/dropdown attribute or column name sent from UI)
  • 2 argument for the dynamic where clause prepared for the similar dropdown values selected in point 1

Data is fetched from a view and result is retrieved in this fashion

CREATE OR REPLACE PROCEDURE database.schema.sp_sample(dynamic_columns VARCHAR, dynamic_where_clause VARCHAR)
RETURNS VARCHAR
LANGUAGE JAVASCRIPT
EXECUTE AS CALLER
AS 
$$
rs="Success";
try {
retrieve_queries_sql = `SELECT COL_1, COL_2, ${DYNAMIC_COLUMNS} FROM view ${DYNAMIC_WHERE_CLAUSE} GROUP BY COL_1, COL_2, ${DYNAMIC_COLUMNS}`;
var stmt = snowflake.createStatement( {sqlText: retrieve_queries_sql, binds:[DYNAMIC_COLUMNS, DYNAMIC_WHERE_CLAUSE, DYNAMIC_COLUMNS]} );
var rs = stmt.execute();
}
catch(err) {
rs= "Failed Message: "+err.message;
}
return rs;
$$;

CALL "database"."schema"."sp_sample"('COL_3','WHERE COL_3\=\'somevalue\'');
SELECT * FROM TABLE(RESULT_SCAN(LAST_QUERY_ID(-2)));

Note: There is no requirement at this point to show data specific to a certain user. Also ignore the argument interpolation part in the code above

Questions:

  1. Is there a concern if this snowflake dynamic stored procedure is called by this UI application (multiple users concurrently selecting values on the UI and this dynamic procedure internally will be called many times) and its a read operation (select only just fetch the aggregated values for the dropdown hierarchies / levels that the users chooses)

  2. Will there be any question of data integrity where different users will not see updated values on the UI (to provide more context data is fetched from a view within this procedure and this view data will not change during the application live / active time - all batch will be completed before the application is accessed by users). So the data is ready and no changes to the data are expected to happen when users will be accessing the application

  3. Got a suggestion from someone that the dynamic procedure will cause data integrity concern where users will not be able to see updated data and to resolve that data integrity problem pre-aggregated views (meaning data is pre-aggregated at certain levels and kept in different views) will help and data consistency will be better.

Wanted to understand how this proposal of pre-aggregated views can help to solve if ever the problem is of data integrity (in the first place why there will be a data integrity concern) because what was done in the procedure is fetching the data from a view (dynamically aggregated) and even in this latter approach we are still creating many other views (pre-aggregated) eventually we are getting data from views in both cases, How does the latter cannot cause a data integrity concern?

Please share some feedback if the points provided makes any sense. Would like to know better opinions

1

1 Answers

0
votes
  1. No, there is no concern. The LAST_QUERY_ID returns the ID of a specified query in the current session.

https://docs.snowflake.com/en/sql-reference/functions/last_query_id.html

  1. No, there won't be any question of data integrity - It's not different then executing a complex SQL directly. You may check isolation level:

https://docs.snowflake.com/en/sql-reference/transactions.html#isolation-level

  1. I didn't understand how it helps to data integrity. In fact, if you use this approach, you may have data integrity issues if you combine this pre-aggregated data with the up-to-date data. Maybe they are the same person who thinks Snowflake is a Hadoop variant? :)