I've got an ETL process which streams data from a mongo cluster to BigQuery. This runs via cron on a weekly basis, and manually when needed. I have a separate dataset for each of our customers, with the table structures being identical across them.
I just ran the process, only to find that while all of my data chunks returned a "success" response ({"kind": "bigquery#tableDataInsertAllResponse"}) from the insertAll api, the table is empty for one specific dataset.
I had seen this happen a few times before, but was never able to re-create. I've now run it twice more with the same results. I know my code is working, because the other datasets are properly populated.
There's no 'streaming buffer' in the table details, and running a count(*) query returns 0 response. I've even tried removing cached results from the query, to force freshness - but nothing helps.
Edit - After 10 minutes from my data stream (I keep timestamped logs) - partial data now appears in the table; however, after another 40 minutes, it doesn't look like any new data is flowing in.
Is anyone else experiencing hiccups in streaming service?
Might be worth mentioning that part of my process is to copy the existing table to a backup table, remove the original table, and recreate it with the latest schema. Could this be affecting the inserts on some specific edge cases?