0
votes

I am streaming data into BigQuery and the logs record no issues at all. When I run "SELECT * FROM datatable WHERE _PARTITIONTIME = TIMESTAMP("2018-11-05") LIMIT 1000", it returns only 16 rows. The row data keeps changing as new data flows in, but the row count stays at 16.

The streaming buffer statistics show that there are rows in the buffer.

I started the stream almost 10 hours ago, so I would expect at least some of the data to be accessible by now.

I am at a bit of a loss here, as I can't see any errors.

This is some sample data that was collected:

https://docs.google.com/spreadsheets/d/1Svm6cDWzSvD0RHGo_O5J16UDvqFfDAK5irNki5nYtos/edit?usp=sharing

1
Could you provide details on how you are streaming the data? I would like to verify that you are not constantly overwriting the first 16 rows. - Rubén C.
I am using this link - user3895426
It seems to overwrite. The streaming buffer statistics keep showing the rows being inserted, as recorded in the logs, but the actual table stays the same. - user3895426
The fact that some data is in the streaming buffer and not yet in the table should be irrelevant. This is internal, and the data should be queryable as soon as it is inserted (into the streaming buffer or the table). For details on querying partitioned tables, see cloud.google.com/bigquery/docs/querying-partitioned-tables (for some time, data may stay in the UNPARTITIONED partition: cloud.google.com/bigquery/docs/…). - Alex Riquelme

1 Answer

2
votes

From the documentation:

when streaming to a partitioned table, data in the streaming buffer has a NULL value for the _PARTITIONTIME pseudo column.

You should change your filter to:

WHERE _PARTITIONTIME IS NULL OR _PARTITIONTIME = TIMESTAMP "2018-11-05"

This will include data for the specified date as well as data that is currently in the streaming buffer.
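
To check where the rows are actually sitting, one option is to count rows per partition before changing the main query. This is only a sketch in standard SQL: `mydataset` is a placeholder dataset name (the question only gives `datatable`), and `pt` and `row_count` are aliases chosen for illustration. Going by the documentation quoted above, rows still in the streaming buffer should show up under a NULL `pt`.

```sql
-- Count rows per partition. _PARTITIONTIME is a pseudo column,
-- so it has to be aliased before it can be selected.
SELECT
  _PARTITIONTIME AS pt,
  COUNT(*) AS row_count
FROM `mydataset.datatable`  -- placeholder dataset name (assumption)
GROUP BY pt
ORDER BY pt;

-- The query from the question with the suggested filter applied.
SELECT *
FROM `mydataset.datatable`  -- placeholder dataset name (assumption)
WHERE _PARTITIONTIME IS NULL
   OR _PARTITIONTIME = TIMESTAMP("2018-11-05")
LIMIT 1000;
```

Keep in mind that the `IS NULL` branch matches everything currently in the streaming buffer, not only rows that will end up in the 2018-11-05 partition.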