0 votes

I would like to automate a CSV file extraction process from Google BigQuery to a Google Cloud Storage bucket, and from the bucket to an external server, using two Python scripts. Could you help me, please? I would appreciate it.


2 Answers

1 vote

To extract a table from BigQuery in Python, you can use the Python Client for Google BigQuery.

The snippet below, based on this repository, should get you going:

from google.cloud import bigquery

client = bigquery.Client()
bucket_name = "my-bucket"  # replace with your own bucket name
project = "bigquery-public-data"
dataset_id = "samples"
table_id = "shakespeare"

destination_uri = "gs://{}/{}".format(bucket_name, "shakespeare.csv")
dataset_ref = bigquery.DatasetReference(project, dataset_id)
table_ref = dataset_ref.table(table_id)

extract_job = client.extract_table(
    table_ref,
    destination_uri,
    # Location must match that of the source table.
    location="US",
)  # API request
extract_job.result()  # Waits for job to complete.

print(
    "Exported {}:{}.{} to {}".format(project, dataset_id, table_id, destination_uri)
)

To get the export onto another server, you can use the Cloud Storage Client Library for Python to download the CSV file from the bucket and then transfer it to your server or service of choice.
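For illustration, here is a minimal sketch of that second step, assuming the bucket and file name used above and a hypothetical upload_url endpoint on your server (replace the HTTP POST with SFTP or whatever your server actually accepts):

from google.cloud import storage
import requests

bucket_name = "my-bucket"                   # bucket used as the extract destination
blob_name = "shakespeare.csv"               # file written by the extract job
local_path = "/tmp/shakespeare.csv"
upload_url = "https://example.com/upload"   # hypothetical endpoint on your server

# Download the exported CSV from Cloud Storage to a local file.
storage_client = storage.Client()
storage_client.bucket(bucket_name).blob(blob_name).download_to_filename(local_path)

# Post the CSV to the external server over HTTP.
with open(local_path, "rb") as f:
    response = requests.post(upload_url, files={"file": f})
response.raise_for_status()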

-1 votes

As far as I know, BigQuery can't export a query result directly to GCS or to a local file. You can write the result to a temporary / staging table and then export that table to GCS, as described here:

https://cloud.google.com/bigquery/docs/exporting-data#exporting_table_data
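A minimal sketch of that staging-table approach, assuming placeholder project, dataset, table and bucket names that you would replace with your own:

from google.cloud import bigquery

client = bigquery.Client()

# Placeholder staging table; replace with your own project/dataset/table.
staging_table = bigquery.TableReference.from_string("my-project.my_dataset.staging_results")

# Run the query and write its result into the staging table.
job_config = bigquery.QueryJobConfig(
    destination=staging_table,
    write_disposition="WRITE_TRUNCATE",
)
query = "SELECT word, word_count FROM `bigquery-public-data.samples.shakespeare`"
client.query(query, job_config=job_config).result()

# Export the staging table to GCS as CSV (see the linked docs for options).
destination_uri = "gs://my-bucket/staging_results.csv"
client.extract_table(staging_table, destination_uri, location="US").result()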

You can then put this in a container, deploy it as a Cloud Run service, and trigger it from Cloud Scheduler.
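As a rough sketch of that setup, the Cloud Run service could be a small HTTP app (Flask is just one option assumed here) that runs the export whenever Cloud Scheduler calls it:

from flask import Flask
from google.cloud import bigquery

app = Flask(__name__)

@app.route("/", methods=["POST"])
def run_export():
    # Placeholder table and bucket names; replace with your own.
    client = bigquery.Client()
    client.extract_table(
        "my-project.my_dataset.staging_results",
        "gs://my-bucket/staging_results.csv",
        location="US",
    ).result()
    return "export complete", 200

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)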