3
votes

I have a BigQuery table and I want to use a job with writeDisposition WRITE_TRUNCATE to overwrite the table with a subset of its rows. I am doing this because I'm trying to mimic a DELETE FROM … WHERE … operation.

Suppose while the job is running, I am simultaneously trying to stream rows into the table. Is it possible for rows to be inserted while the job is running and so be overwritten when the job completes? Or is there a locking mechanism that will prevent the rows from being inserted until the job finishes?
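For concreteness, here is a rough sketch of the kind of job described above, written as a request body for the BigQuery `jobs.insert` REST method. The helper name `build_truncate_job` and the project, dataset, table and condition values are made up for illustration. The query keeps every row that the equivalent DELETE would have left alone, and writes the result back over the same table.

```python
def build_truncate_job(project, dataset, table, delete_condition):
    """Build a jobs.insert body that rewrites a table without the rows
    matching delete_condition (hypothetical helper, for illustration)."""
    table_ref = f"`{project}.{dataset}.{table}`"
    # Keep the rows a DELETE ... WHERE would not remove. IS NOT TRUE also
    # keeps rows where the condition evaluates to NULL, as DELETE would.
    query = f"SELECT * FROM {table_ref} WHERE ({delete_condition}) IS NOT TRUE"
    return {
        "configuration": {
            "query": {
                "query": query,
                "useLegacySql": False,
                # Destination is the same table the query reads from.
                "destinationTable": {
                    "projectId": project,
                    "datasetId": dataset,
                    "tableId": table,
                },
                # Replace the table contents with the query result.
                "writeDisposition": "WRITE_TRUNCATE",
            }
        }
    }


# Placeholder names, not taken from a real project.
job = build_truncate_job(
    "my-project", "my_dataset", "events", "event_date < '2015-01-01'"
)
```

The body still has to be submitted with whichever client you use; this only shows the configuration in question.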


2 Answers

1
votes

In this case you need to pause the streaming inserts while you run your operation, and resume them once it is done. There is no locking.

You should also allow a cooling-down period after you stop the streaming inserts, because they are processed in the background and you need to let the system finish.

1
votes

Because of the table metadata caching layer in the streaming system, it currently takes about 10 minutes for the system to notice that a table has been truncated. During those ~10 minutes, all streamed data will be dropped, because it is treated as part of the truncated data.

As Pentium10 suggested, you should pause the streaming requests while you are doing a WRITE_TRUNCATE, and resume them ~10 minutes after the truncation is done.
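The sequence recommended in the two answers (pause, cool down, truncate, wait, resume) could be sketched as below. This is only an outline under some assumptions: `StreamingGate` and `truncate_with_paused_streaming` are hypothetical names, your streaming writers are assumed to check the gate before each insert, and `run_truncate_job` is assumed to be a callable that submits the WRITE_TRUNCATE job and blocks until it finishes. The 600-second wait reflects the ~10 minutes mentioned above; the 120-second cool-down is an arbitrary placeholder, since no specific value is documented here.

```python
import threading
import time


class StreamingGate:
    """Hypothetical switch that streaming writers check before each insert."""

    def __init__(self):
        self._open = threading.Event()
        self._open.set()

    def pause(self):
        self._open.clear()

    def resume(self):
        self._open.set()

    def is_open(self):
        return self._open.is_set()


def truncate_with_paused_streaming(
    gate, run_truncate_job, cool_down_s=120, settle_s=600, sleep=time.sleep
):
    """Pause streaming, truncate, and resume only after the settle period."""
    gate.pause()
    # Give rows that were already streamed time to be processed
    # (assumed value, tune for your workload).
    sleep(cool_down_s)
    # Assumed to block until the WRITE_TRUNCATE job has completed.
    run_truncate_job()
    # Wait for the streaming metadata cache to notice the truncation.
    sleep(settle_s)
    # If anything above raises, the gate stays closed on purpose.
    gate.resume()


# Dry run with a fake job and a fake sleep, so nothing actually waits.
events = []
gate = StreamingGate()
truncate_with_paused_streaming(
    gate,
    run_truncate_job=lambda: events.append(("truncate", gate.is_open())),
    sleep=lambda seconds: events.append(("sleep", seconds)),
)
```

In real use you would drop the `sleep` override and pass a function that runs the actual job. There is still no locking on the BigQuery side, so this only helps if every writer honours the gate.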