0 votes

When I try to load files from GCS into a BigQuery table, it fails with this error (by the way, I'm using Python):

Forbidden: 403 Exceeded rate limits: too many table update operations for this table. For more information, see https://cloud.google.com/bigquery/troubleshooting-errors

There are about 10 files from GCS to load for each table, but when I run this 3 times within a day, I see the error above.

I checked this page, too, but I still don't know what is going on: https://cloud.google.com/bigquery/quotas#standard_tables

To provide a bit more detail, here is a portion of the Python code:


    from google.cloud import bigquery

    job_config = bigquery.LoadJobConfig()
    job_config.schema = SCHEMA
    job_config.source_format = bigquery.SourceFormat.NEWLINE_DELIMITED_JSON
    job_config.write_disposition = 'WRITE_APPEND'

    # This for loop runs about 10 times for a table_ref,
    # namely there are about 10 territories in territories
    load_jobs = []
    for territory in territories:
        gcs_uri = f"gs://my-bucket/path/to/file_{date}_{territory}.txt"
        load_job = bigquery_client.load_table_from_uri(
            gcs_uri, table_ref, job_config=job_config
        )
        load_job.territory = territory
        load_jobs.append(load_job)
        print(f"Starting job {territory} {load_job.job_id}")

    # Wait for all load jobs to finish
    for load_job in load_jobs:
        load_job.result()
        print(f"Job finished {load_job.territory}.")

Thanks!

Maybe you have more files than you think you do? Check your job history in the UI. You can combine multiple files in a single load with e.g. gs://my_bucket/files*.avro. – Elliott Brossard
Hmm, I actually print out when a load job starts and ends, as in the (updated) code above, and there are clearly only about 10 repetitions per table_ref. But I didn't know we could use *, which is very useful. I'm trying that and will come back here with the result. – untidyhair
* did help me here, FYI. – untidyhair
Great! Consider posting an answer to your question showing how you solved it. – Elliott Brossard
Yeah, thanks! I was kind of reluctant because I still don't know why the original loop exceeded the rate limit (or which limit it hit), but using * could help people, so I'm adding it as an answer. – untidyhair

1 Answer

0 votes

It's still not clear why I hit the rate limit, but what @Elliott Brossard suggested helped me in my case.

So instead of doing this:

    for territory in territories:
        gcs_uri = f"gs://my-bucket/path/to/file_{date}_{territory}.txt"

I was simply able to do this:

    gcs_uri = f"gs://my-bucket/path/to/file_{date}_*.txt"

This not only solved the rate limit issue, it also sped up the loading!
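
For reference, here is a minimal sketch of the consolidated version, assuming the same SCHEMA, table_ref, bigquery_client, date, and job_config setup as in the question. With a single wildcard URI there is only one load job per table instead of ~10, which presumably keeps the number of table update operations under the rate limit:

    from google.cloud import bigquery

    job_config = bigquery.LoadJobConfig()
    job_config.schema = SCHEMA
    job_config.source_format = bigquery.SourceFormat.NEWLINE_DELIMITED_JSON
    job_config.write_disposition = 'WRITE_APPEND'

    # One wildcard URI matches all ~10 territory files for this date,
    # so only a single load job (one table update) is started.
    gcs_uri = f"gs://my-bucket/path/to/file_{date}_*.txt"
    load_job = bigquery_client.load_table_from_uri(
        gcs_uri, table_ref, job_config=job_config
    )
    print(f"Starting job {load_job.job_id}")

    load_job.result()  # wait for the single job to finish
    print("Job finished.")

If a wildcard can't match your files precisely, load_table_from_uri also accepts a list of URIs, so the ~10 files can still be submitted as one load job.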