
We have been using streaming inserts on two datasets, one with date-partitioned tables and one with non-date-partitioned. They have been working perfectly fine for a very long time (over a year).

Recently we have noticed that BigQuery responds with 503 errors more frequently. Anywhere from 15% to 50% of the time.

To analyze the issue, we built test clients that stream a single record using the templateSuffix option. We see behavior that we have not seen before. Here are the scenarios:

  1. templateSuffix='xxx' and tableId='sessions_', where sessions_xxx does not exist. This works!
  2. templateSuffix='yyy' and tableId='sessions_', where sessions_yyy exists from a while ago. This fails! (503)
  3. templateSuffix='xxx' and tableId='sessions_', where sessions_xxx was created by the first test. This works!
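
For anyone trying to reproduce the scenarios above, here is a minimal sketch of the request body a test client would send to tabledata.insertAll. It only builds the JSON payload and does not call the API. The row content (`session_id`) and the helper name `build_insert_all_body` are made up for illustration; the suffixes are the ones from the scenarios.

```python
import json
import uuid


def build_insert_all_body(row, template_suffix=None, insert_id=None):
    """Build a tabledata.insertAll request body for a single row.

    Hypothetical helper: it only assembles the payload, sending it is
    left to whichever HTTP or BigQuery client is in use.
    """
    body = {
        "kind": "bigquery#tableDataInsertAllRequest",
        "rows": [
            {
                # insertId lets BigQuery de-duplicate retried rows (best effort).
                "insertId": insert_id or str(uuid.uuid4()),
                "json": row,
            }
        ],
    }
    # With templateSuffix set, the row is routed to <tableId><templateSuffix>,
    # e.g. tableId 'sessions_' plus suffix 'xxx' targets 'sessions_xxx'.
    if template_suffix is not None:
        body["templateSuffix"] = template_suffix
    return body


# One-field row, as in the single-record tests (field name is an assumption).
row = {"session_id": "test-1"}
for suffix in ("xxx", "yyy"):
    print(json.dumps(build_insert_all_body(row, template_suffix=suffix), indent=2))
```

Passing a fixed `insert_id` makes it possible to resend exactly the same payload against each suffix, so the only variable between scenarios is the target table.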

We see the same behavior when using the API Explorer too.

For some unknown reason, our production system still only fails at the rates mentioned earlier, but the tests fail consistently. The production system is located at a data center, but the tests are being run from our offices.

Looks like 503 is the catch-all error, and we even tried to pass only one field to make sure it wasn't the data. No luck there either. The scenarios are consistent.

Streaming without a template suffix works for both long-standing and newly created tables, which leads us to believe that the issue lies with the template suffix feature.

Possibly unrelated, but when streaming to the second dataset, which has date-partitioned tables, the error rates are very similar. In addition, when the insertAll call responds with no errors, the data does not show up in the table around 40% of the time, even after a day.
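
One thing worth ruling out for the missing rows: an insertAll call can return HTTP 200 and still report per-row failures in the `insertErrors` field of the response body. Below is a rough sketch of checking that field on the parsed JSON response. The function name `failed_rows` and the sample response are illustrative only, and this would not explain rows that go missing when `insertErrors` is truly absent.

```python
def failed_rows(response):
    """Map row index -> list of error reasons from an insertAll response body.

    `response` is the parsed JSON body. An empty result means the service
    reported no per-row errors.
    """
    failures = {}
    for entry in response.get("insertErrors", []):
        reasons = [err.get("reason", "unknown") for err in entry.get("errors", [])]
        failures[entry.get("index")] = reasons
    return failures


# Made-up response for illustration: row 0 rejected, other rows accepted.
sample = {
    "kind": "bigquery#tableDataInsertAllResponse",
    "insertErrors": [
        {"index": 0, "errors": [{"reason": "invalid", "message": "no such field"}]}
    ],
}
print(failed_rows(sample))
```

Logging the result of this check next to each `insertId` would show whether the missing rows were ever acknowledged without errors.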

Could you provide the specific error message for each failure? - Y Y
Almost all of them are: 503 Service Unavailable { "code" : 503, "errors" : [ { "domain" : "global", "message" : "Error encountered during execution. Retrying may solve the problem.", "reason" : "backendError" } ], "message" : "Error encountered during execution. Retrying may solve the problem.", "status" : "UNAVAILABLE" } - Rajiv Sunkara
You can send me more information like project_id, dataset_id, table_id, insertion time, so that I can take a look. - Y Y

1 Answer


You are most probably hitting a known bug in which the service reports INTERNAL_ERROR instead of SCHEMA_INCOMPATIBLE. In other words, the system incorrectly returns a 503 when the real cause is a schema mismatch: the generated tables have a different schema from the template table. Generated tables should be created from the template table automatically by the streaming system.
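
To check whether this applies to you, you could compare the schema of the template table (`sessions_`) with that of an older generated table (such as `sessions_yyy`). Here is a minimal sketch that diffs two field lists in the shape returned under `schema.fields` by tables.get. Fetching the schemas is left out, the function name `schema_diff` and the sample fields are hypothetical, and nested RECORD fields are not compared.

```python
def schema_diff(template_fields, generated_fields):
    """Compare top-level fields of two BigQuery schemas.

    Each argument is a list of dicts with 'name', 'type' and optionally
    'mode' (treated as NULLABLE when absent).
    """
    def index(fields):
        return {
            f["name"]: (f["type"].upper(), f.get("mode", "NULLABLE").upper())
            for f in fields
        }

    template, generated = index(template_fields), index(generated_fields)
    return {
        "missing_in_generated": sorted(template.keys() - generated.keys()),
        "extra_in_generated": sorted(generated.keys() - template.keys()),
        "changed": {
            name: {"template": template[name], "generated": generated[name]}
            for name in sorted(template.keys() & generated.keys())
            if template[name] != generated[name]
        },
    }


# Made-up schemas for illustration: the template gained a column after
# the generated table was created.
template_schema = [
    {"name": "session_id", "type": "STRING", "mode": "NULLABLE"},
    {"name": "started_at", "type": "TIMESTAMP", "mode": "NULLABLE"},
]
generated_schema = [
    {"name": "session_id", "type": "STRING", "mode": "NULLABLE"},
]
print(schema_diff(template_schema, generated_schema))
```

If the diff is non-empty for the tables that return 503 and empty for the ones that work, that would be consistent with the schema mismatch described above.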