3
votes

We have a working prototype in Google Cloud DataLab where we have made extensive use of the %%bq 'magics' to build up a parameterized query using various UDFs.

So something like:

customFn1 = bq.UDF("customFn1",theJsCode1,theDataTypeDef1,params=paramDefs1)
customFn2 = bq.UDF("customFn2",theJsCode2,theDataTypeDef2,params=paramDefs2)

then:

%%bq query --name finalQry --udfs customFn1 customFn2
SELECT 
     customFn1(bell, whistle) AS Output1,
     customFn2(rank,serialNo) AS Output2
FROM
     myImaginaryTable
WHERE
    id < @param1 AND id > @param2

then:

%%bq sample -q finalQry 
parameters:
   - name: param1
     type: STRING
     value: "ab1"
   - name: param2
     type: STRING
     value: "ab3"

We'd like to start working towards deploying this prototype and need to access the final query text to use in our production code (which will, in due course, be called through the API).

I know that GC datalab is doing some work behind the scenes to compose this query (our real world example is much more complex) which I need to replicate in our code.

it would be really useful if I could get access the actual query string GC datalab generates after it has composed the udfs, replaced the parameters, etc. ? I could do it manually with a shedload of copying and pasting, etc. but it would be super-useful if I could automatically access this final query string somewhere?

I have had a read at the docs here: http://googledatalab.github.io/pydatalab/datalab.bigquery.html but cannot seem to find what I need. Can anyone help?

1

1 Answers

3
votes

Have you tried finalQry.sql? This will give you the query's SQL string. However, this will be before any parameter substitution, since this happens at execution time (your sample call in this case).

See http://googledatalab.github.io/pydatalab/datalab.bigquery.html#datalab.bigquery.Query.sql

If you think getting the SQL string after substitution is a useful feature, you can open a feature request on the Github repo.