3
votes

I'm using the python client to create tables via SQL as explained in the docs (https://cloud.google.com/bigquery/docs/tables) like so:

# from google.cloud import bigquery
# client = bigquery.Client()
# dataset_id = 'your_dataset_id'

job_config = bigquery.QueryJobConfig()
# Set the destination table
table_ref = client.dataset(dataset_id).table('your_table_id')
job_config.destination = table_ref
sql = """
    SELECT corpus
    FROM `bigquery-public-data.samples.shakespeare`
    GROUP BY corpus;
"""

# Start the query, passing in the extra configuration.
query_job = client.query(
    sql,
    # Location must match that of the dataset(s) referenced in the query
    # and of the destination table.
    location='US',
    job_config=job_config)  # API request - starts the query

query_job.result()  # Waits for the query to finish
print('Query results loaded to table {}'.format(table_ref.path))

This works well except that the client function for creating a table via SQL query uses a job_config object, and job_config receives a table_ref, not a table object.

I found this doc for creating tables with description here: https://google-cloud-python.readthedocs.io/en/stable/bigquery/usage.html, But this is for tables NOT created from queries.

Any ideas on how to create a table from query while specifying a description for that table?

1
orcaman not sure what you are trying to do, when you set a destination Table it's equivilant to create and you cant do create with destination Table in the same transaction. Can you please clarify your use case.Tamir Klein
@ElliottBrossard not helpful, I have to use the python client (automation)orcaman
@TamirKlein as I've written, I need to create from query (this part is done), plus, attach a description for the table that was generated (this part seems to be impossible with the python client). For instance "this table was created via agent #123"orcaman
You can issue DDL queries using the Python client, as Tamir showed you in his answer.Elliott Brossard

1 Answers

1
votes

Since you want to do more than only save the SELECT result to a new table the best way for you is not use a destination table in your job_config variable rather use a CREATE command

So you need to do 2 things:

  1. Remove the following 2 lines from your code
table_ref = client.dataset(dataset_id).table('your_table_id')   
job_config.destination = table_ref
  1. Replace your SQL with this
#standardSQL
CREATE TABLE dataset_id.your_table_id
PARTITION BY DATE(_PARTITIONTIME)
OPTIONS(
    description = 'this table was created via agent #123'
) AS
SELECT corpus
FROM `bigquery-public-data.samples.shakespeare`
GROUP BY corpus;