How to query for data in streaming buffer ONLY in BigQuery?

Question

We have a table partitioned by day in BigQuery, which is updated by streaming inserts.

The doc says that: "when streaming to a partitioned table, data in the streaming buffer has a NULL value for the _PARTITIONTIME pseudo column"

But when I run select count(*) from table where _PARTITIONTIME is NULL, it always returns 0, even though bq show tells me that there are a lot of rows in the streaming buffer.

Does this mean that the pseudo column is not present at all for rows in streaming buffer? In any case, how can I query for the data ONLY in the streaming buffer without it becoming a full table scan?

Thanks in advance

What is the practical use case for this? I don't think you can query/read the streaming buffer, but if you explain why you think you need to be able to read it, we might figure out how to make it work. — Mikhail Berlyant
I have a streaming job that keeps updating a table in BigQuery. I have a downstream job that triggers every 15 mins and aggregates the data for the day so far, so it needs to query something equivalent to where _PARTITIONTIME = today OR data_in_streaming_buffer. Is there any way to achieve this? Thanks. — Venkatesh Iyer

Pentium10 · Accepted Answer · 2017-02-02T07:51:51

Data in the streaming buffer has a NULL value for the _PARTITIONTIME column.

SELECT
  fields
FROM
  `dataset.partitioned_table_name`
WHERE
  _PARTITIONTIME IS NULL

https://cloud.google.com/bigquery/docs/partitioned-tables#copying_to_partitioned_tables
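
For the use case described in the comments (aggregating today's data so far, including rows still in the streaming buffer), the two conditions can probably be combined in one filter. The following is only a sketch in standard SQL, not a tested recipe: it assumes an ingestion-time partitioned table, reuses the placeholder table name from the query above, and assumes that "today" means the current UTC day, since _PARTITIONTIME values fall on UTC day boundaries. Whether the OR filter still limits the scan to a single partition plus the buffer is worth checking with a dry run.

```sql
-- Sketch: count today's rows plus rows still in the streaming buffer.
-- `dataset.partitioned_table_name` is a placeholder, as in the query above.
SELECT
  COUNT(*) AS row_count
FROM
  `dataset.partitioned_table_name`
WHERE
  -- rows not yet assigned to a partition (streaming buffer)
  _PARTITIONTIME IS NULL
  -- rows already written to today's partition (midnight UTC)
  OR _PARTITIONTIME = TIMESTAMP(CURRENT_DATE())
```

Note that rows in the streaming buffer are not necessarily all from today, so a job running shortly after midnight UTC may want an additional filter on its own event timestamp column, if the table has one.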
