
I've been using BigQuery for about 2 months. During that time I've used streaming inserts to add thousands of entries every minute, and I've been able to query that data within a few minutes, if not practically instantly.

Starting a few days ago, though, one of my tables suddenly started showing delays in data availability ranging from 20 to 60 minutes. This only occurs with one of my tables. Data inserted into other tables remains available nearly instantly.

Is this kind of data availability delay normal for BigQuery?

The table experiencing this problem is accuAudience.trackPlays. I will gladly provide project ID and other info to a Google team member.

The results of the streaming inserts into the problematic table are:

{'kind': 'bigquery#tableDataInsertAllResponse'}
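
A response that contains only `kind` means the API reported no per-row failures; rejected rows would be listed under an `insertErrors` key. A minimal sketch of that check, assuming the response has already been parsed into a Python dict (the helper name `failed_rows` and the sample error payload are made up for illustration, not taken from the real table):

```python
def failed_rows(response):
    """Map each rejected row index to the list of error reasons reported for it."""
    failures = {}
    for entry in response.get("insertErrors", []):
        reasons = [err.get("reason", "unknown") for err in entry.get("errors", [])]
        failures[entry.get("index")] = reasons
    return failures


# The response shown in the question: no insertErrors key at all.
ok_response = {"kind": "bigquery#tableDataInsertAllResponse"}

# Hypothetical failing response, for contrast only.
bad_response = {
    "kind": "bigquery#tableDataInsertAllResponse",
    "insertErrors": [
        {"index": 3, "errors": [{"reason": "invalid", "message": "hypothetical"}]}
    ],
}

print(failed_rows(ok_response))   # empty dict: every row was accepted
print(failed_rows(bad_response))  # row 3 was rejected
```

An empty result here only says the rows were accepted; it says nothing about how soon they become visible to queries, which is the delay described in this question.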

Example query from problematic table, accuAudience.trackPlays (ordered by date desc):

ROW DATE COUNT
1 2015-03-30 12:35:32 UTC 67
2 2015-03-30 12:35:31 UTC 65
3 2015-03-30 12:35:30 UTC 56
4 2015-03-30 12:35:29 UTC 45
5 2015-03-30 12:35:28 UTC 60

Same query made seconds later to different table (accuAudience.trackSkips). Note the date field is 30 minutes ahead of the earlier query.

ROW DATE COUNT
1 2015-03-30 13:04:03 UTC 1
2 2015-03-30 13:04:02 UTC 1
3 2015-03-30 13:04:01 UTC 3
4 2015-03-30 13:04:00 UTC 3
5 2015-03-30 13:03:59 UTC 6
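
To put a number on the gap, the newest timestamps from the two result sets above can be subtracted directly. A small sketch using only those two values (the variable names are just for illustration):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Newest row visible in each table at roughly the same moment.
latest_plays = datetime.strptime("2015-03-30 12:35:32", FMT)  # accuAudience.trackPlays
latest_skips = datetime.strptime("2015-03-30 13:04:03", FMT)  # accuAudience.trackSkips

lag = latest_skips - latest_plays
print(lag)  # 0:28:31
```

So at the time of these two queries, trackPlays was about 28.5 minutes behind trackSkips.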

If there's other information needed, please let me know!

This is the second question about a similar issue; we need an official answer from the BQ team (it might be a performance issue). Please post your project and table so that someone from the BQ team can look into it when they check this. Linked question: stackoverflow.com/questions/29246369/… – Pentium10
Thanks! I've updated my post with table names and I can provide my project ID and any other necessary info to a Google team member if need be. – Michael Schmitt
It's 2017 and streamed data still appears with 10+ minute delays. For me, sending select * queries helped find the data. But why does it sit in the streaming buffer for hours? A couple of similar issues, for reference: stackoverflow.com/questions/39407558/… stackoverflow.com/questions/22867090/… – Anton Tarasenko

1 Answer

BigQuery periodically runs background maintenance tasks to optimize your tables for querying. One of these background tasks caused a hiccup with the streaming process, and as a result we could not read from the streaming buffer until it was flushed. Note that if you were streaming to the table continuously, you may have seen this as an ongoing issue.

It is fixed now. If you continue to see the problem, please let us know what table & project you are seeing the issue with.
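
For anyone who wants to check how long rows have been sitting in the streaming buffer, here is a rough sketch. It assumes the table resource returned by `tables.get` includes a `streamingBuffer` object whose `oldestEntryTime` is a string of milliseconds since the epoch, as the current API reference describes; that may not match what was exposed when this incident happened. The helper name and the sample resource below are hypothetical.

```python
from datetime import datetime, timezone


def streaming_buffer_age(table, now=None):
    """Return how long the oldest buffered row has been waiting, or None if no buffer is reported."""
    buf = table.get("streamingBuffer")
    if buf is None:
        return None
    now = now or datetime.now(timezone.utc)
    oldest = datetime.fromtimestamp(int(buf["oldestEntryTime"]) / 1000, tz=timezone.utc)
    return now - oldest


# Hypothetical table resource built from the timestamps in the question.
oldest_ms = int(datetime(2015, 3, 30, 12, 35, 32, tzinfo=timezone.utc).timestamp() * 1000)
sample_table = {"streamingBuffer": {"oldestEntryTime": str(oldest_ms)}}

age = streaming_buffer_age(
    sample_table, now=datetime(2015, 3, 30, 13, 4, 3, tzinfo=timezone.utc)
)
print(age)
```

A large age on a table that is being streamed to continuously would be consistent with the buffer not being read, though on its own it does not show the cause.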