1 vote

My Python app stores result data in BigQuery. In the code I generate JSON that reflects the target BQ table structure and then insert it.
Generally it works fine, but it fails to save rows whose size exceeds 1 MB. This is a limitation of streaming inserts.
I checked the Google API documentation: https://googleapis.dev/python/bigquery/latest/index.html
It seems that Client methods like insert_rows or insert_rows_json use the insertAll method underneath, which relies on the streaming mechanism.
Is there a way to invoke a "standard" BigQuery insert from Python code to insert a row larger than 1 MB? It would be a rather rare occurrence, so I am not concerned about quotas regarding the daily table load job limit.
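
For reference, this is roughly the insert pattern I use today (table name and row contents are placeholders):

```python
from google.cloud import bigquery

client = bigquery.Client()

# Rows are plain dicts matching the target table schema.
rows = [{"id": 1, "payload": "..."}]

# insert_rows_json calls the streaming insertAll API under the hood,
# which rejects any single row larger than 1 MB.
errors = client.insert_rows_json("my_project.my_dataset.my_table", rows)
if errors:
    print("Insert failed:", errors)
```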


1 Answer

2 votes

The client library cannot go around the API limits. See the current quotas: as of this writing, a row inserted via streaming cannot be larger than 1 MB.

The workaround we used is to save records as newline-delimited JSON (NDJSON) to GCS in 100 MB batches - we use the gcsfs library - and then execute a BigQuery load job (e.g. load_table_from_uri).
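
A minimal sketch of that approach, assuming the gcsfs and google-cloud-bigquery packages are installed and using placeholder bucket/table names:

```python
import json

import gcsfs
from google.cloud import bigquery


def load_rows_via_gcs(rows, uri="gs://my-bucket/batches/batch.ndjson",
                      table_id="my_project.my_dataset.my_table"):
    """Write rows as newline-delimited JSON to GCS, then run a load job."""
    fs = gcsfs.GCSFileSystem()

    # One JSON object per line - the format BigQuery load jobs expect.
    with fs.open(uri, "w") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    )

    # Load jobs are not subject to the 1 MB per-row streaming limit.
    load_job = client.load_table_from_uri(uri, table_id, job_config=job_config)
    load_job.result()  # wait for completion; raises on error
```

Load jobs do count against the per-table daily load job quota, but as you said, rare oversized rows should not be a concern there.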

I have actually just logged a feature request here to increase the limit, as this is very limiting. If interested, make sure to "star" it to gain traction.