8
votes

Here is a code I am using as a reference from https://cloud.google.com/bigquery/docs/managing-tables#bigquery-copy-table-python:

source_dataset = client.dataset('samples', project='bigquery-public-data')
source_table_ref = source_dataset.table('shakespeare')

# dataset_id = 'my_dataset'
dest_table_ref = client.dataset(dataset_id).table('destination_table')

job = client.copy_table(
    source_table_ref,
    dest_table_ref,
    # Location must match that of the source and destination tables.
    location='US')  # API request

job.result()  # Waits for job to complete.

In my case the destination table exists and I get this error:

Already Exists

How can I overwrite with this copy_table method? In the bq commandline, I can use -f option. So I am looking for the same flag.

2

2 Answers

21
votes

You need to pass the job config to the request like this:

job_config = bigquery.CopyJobConfig()
job_config.write_disposition = "WRITE_TRUNCATE"
job = client.copy_table(
    source_table_ref,
    dest_table_ref,
    location='US',
    job_config=job_config)  # API request

More docs here: https://googleapis.github.io/google-cloud-python/latest/bigquery/generated/google.cloud.bigquery.client.Client.copy_table.html

7
votes

See the linked docs:

configuration.copy.writeDisposition: Specifies the action that occurs if the destination table already exists.

The following values are supported:

  • WRITE_TRUNCATE: If the table already exists, BigQuery overwrites the table data.
  • WRITE_APPEND: If the table already exists, BigQuery appends the data to the table.
  • WRITE_EMPTY: If the table already exists and contains data, a 'duplicate' error is returned in the job result.

The default value is WRITE_EMPTY.