1
votes

We have a partitioned BigQuery table that's partitioned on a custom field (event_timestamp). This was due to needing to import historical data while still keeping things partitioned and query costs under control.

This means the table doesn't have a _PARTITIONTIME field, and we can't do queries WHERE _PARTITIONTIME IS NULL to view recently sent records sitting in the streaming buffer.

So is there a trick to being able to query the streaming buffer even if you're using a custom field for partitioning?

Edit We're using Standard SQL, not Legacy SQL, sorry for not being explicit about that previously.

1
can you say more about your goal when querying just for the data existing in the buffer?Felipe Hoffa
@FelipeHoffa Primarily it's to test/verify that functions are working as expected and writing/capturing the expected data. The delays with flushing make this lack of visibility a regular stumbling block. Typically when you don't get the results back you expect, you assume it's your code - in this case it's actually working, but things are hidden in the buffer.EyePulp

1 Answers

3
votes

Even though querying tables with time partitioning field supports only via standard SQL, I believe Legacy SQL is still supported to query the streaming inserts into the table

#legacySQL
select * from [<project-name>:<data-set>.<table>$__UNPARTITIONED__] 

Reference:

https://cloud.google.com/bigquery/docs/partitioned-tables

Query Streaming Data Using --time_partitioning_field