
I want to create a BigQuery table with Airflow (via the BigQuery hook or the BigQuery empty-table operator). Unfortunately, neither exposes a way to create the table with range partitioning.

Someone raised a PR to add it, but the maintainers suggested skipping it to keep the Airflow operator interface small, and gave this workaround instead:

you can use the table_resource argument to pass any table definition you want, no need to specify every single parameter.

It's not clear to me how to use the range-partition JSON in Airflow. Can someone give me an example of how it can be used?

I tried the following, but it creates the table without a partition. (This example uses time partitioning, which the BigQuery operator already has a parameter for, but I wanted to try it with table_resource.)

resource="""{
    "TimePartitioning": {
        "type": "DAY",
        "field": "created_at"
    }
}"""
schema="""[{"mode": "REQUIRED", "name": "code", "type": "STRING"}, 
{"mode": "REQUIRED", "name": "created_at", "type": "DATETIME"}, 
{"mode": "REQUIRED", "name": "service_name", "type": "STRING"}]"""

def bq_create(**kwargs):
    table_schema = 'bhuvi'
    table_name = 'sampletable'
    create = BigQueryCreateEmptyTableOperator (
        task_id='create_bq_{}'.format(table_name),
        project_id = 'myproject',
        dataset_id=table_schema,
        table_id=table_name,
        schema_fields=json.loads(schema),
        bigquery_conn_id='bigquery_default',
        table_resource =json.loads(resource)
        )
    create.execute(context=kwargs)

bqcreate = PythonOperator(
            task_id="bqcreate",
            python_callable=bq_create,
            provide_context=True,
            dag=dag
        )
bqcreate

1 Answer


According to the documentation, the correct key for time partitioning is timePartitioning (lowercase "t"), not TimePartitioning. Could that be the issue? https://cloud.google.com/bigquery/docs/reference/rest/v2/tables#Table
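
To make that concrete, here is a minimal sketch of what the two table resources might look like, written as plain Python dicts that follow the REST Table resource linked above. The customer_id column, the helper function names, and the start/end/interval values are assumptions added for illustration (range partitioning needs an INTEGER column, and the schema in the question has none); they are not from the question.

```python
import json

# Schema from the question, plus a hypothetical INTEGER column ("customer_id"),
# added only because range partitioning requires an integer column.
SCHEMA_FIELDS = [
    {"mode": "REQUIRED", "name": "code", "type": "STRING"},
    {"mode": "REQUIRED", "name": "created_at", "type": "DATETIME"},
    {"mode": "REQUIRED", "name": "service_name", "type": "STRING"},
    {"mode": "REQUIRED", "name": "customer_id", "type": "INTEGER"},
]


def time_partitioned_resource():
    """Table resource with daily time partitioning on created_at."""
    return {
        "schema": {"fields": SCHEMA_FIELDS},
        # The key is camelCase: "timePartitioning", not "TimePartitioning".
        "timePartitioning": {"type": "DAY", "field": "created_at"},
    }


def range_partitioned_resource(start=0, end=1000, interval=10):
    """Table resource with integer-range partitioning on customer_id.

    The start/end/interval defaults are illustrative values only.
    """
    return {
        "schema": {"fields": SCHEMA_FIELDS},
        "rangePartitioning": {
            "field": "customer_id",
            # The REST reference documents these int64 values as strings.
            "range": {
                "start": str(start),
                "end": str(end),
                "interval": str(interval),
            },
        },
    }


if __name__ == "__main__":
    # Print the JSON that would be sent as the table definition.
    print(json.dumps(range_partitioned_resource(), indent=2))
```

Either dict could then be passed as table_resource=... to BigQueryCreateEmptyTableOperator together with project_id, dataset_id and table_id. As far as I can tell from the provider documentation, the other table arguments (such as schema_fields) are ignored once table_resource is given, which is why the schema sits inside the resource in this sketch; it is worth checking that against the provider version you run.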