
I'm working with BigQuery and the Python Client Library v0.28. I would like to insert query results from one table into a streaming table (one partition per day).

I have 2 tables:

- Table_A contains my source data
- table_B will be enriched from table_A after some processing (table_B_20101001, table_B_20101002, ...)

I went through the documentation but did not find any examples. Can someone help me?

Many thanks!


1 Answer


From what you described, it seems like your table_B is not actually partitioned but rather sharded into date-suffixed tables.

One thing you could do is to run a query and set it up to save the results into the tables you want, like so:

import os
from google.cloud.bigquery import Client, job
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path/to/your/credentials.json'

bc = Client()  # BigQuery client

# Configure the query job to append its results to a destination table
config = job.QueryJobConfig()
config.write_disposition = 'WRITE_APPEND'

dataset = bc.dataset('name of dataset where table_B is located')
table = dataset.table('table_B_20101001')
config.destination = table

query = """SELECT (make the data transformations you want) FROM table_A"""
query_job = bc.query(query, job_config=config)
query_job.result()  # wait for the query to finish

This script queries the source table_A, applies the data transformations you want, and saves the results into table_B_20101001 (change the table name accordingly).

The operation appends results there; if you want to replace the table's contents instead, set config.write_disposition = 'WRITE_TRUNCATE'.

You did mention streaming to table_B, though. I think you should only use that option if the approach presented above is not enough for you, as streaming is more expensive and the operation takes a bit longer.

Basically, you'd have to use the bc.create_rows method as described in the docs and set rows to be the results of your query job.
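
In case it helps, here is a minimal sketch of that streaming route, reusing the bc client from above and assuming table_B_20101001 already exists with a schema matching the query results (get_table, result() and create_rows are the v0.28 calls I'd expect to use here):

dataset = bc.dataset('name of dataset where table_B is located')
table = bc.get_table(dataset.table('table_B_20101001'))  # fetch the table with its schema, needed by create_rows

# Run the transformation query without a destination table
query_job = bc.query("""SELECT (make the data transformations you want) FROM table_A""")
rows = [row.values() for row in query_job.result()]  # materialize the query results as tuples

# Stream the rows; create_rows returns a list of per-row errors (empty on success)
errors = bc.create_rows(table, rows)
if errors:
    print(errors)

Note that this pulls all the query results into memory before streaming them, so for large result sets the query-with-destination approach above is usually the better fit.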