We are writing code to 'augment' our BigQuery tables with additional metadata in an automated way by using the Python API
We do this by using the key-value labeling system (works fine), and updating the column descriptions by:
- Per table, iterate over the schema columns
- Do something with the column metadata
- Add the column to a new schema
- Update the schema
However, the last step gives an error when working with tables that having RECORD type fields. When trying to update the schema I receive the following error on the 1st nested field inside the RECORD field:
SchemaField(u'nestedField', u'integer', u'NULLABLE', None, ()) is not JSON serializable
The code looks like this:
orig_table = bigquery_client.get_table(table)
schema = list(orig_table.schema)
new_schema = list()
for column in schema:
columntxt = *do something to define the column metadata*
new_schema.append(bigquery.SchemaField(column.name, column.field_type, column.fields, description=columntxt ))
orig_table.schema = new_schema
table = bigquery_client.update_table(orig_table, ['schema', 'labels'])
How to properly resolve this / make the code work with nested schema tables?