1
votes

I'm trying to use Dataflow to stream into a BigQuery partitioned table. The documentation says:

Data in the streaming buffer has a NULL value for the _PARTITIONTIME column.

I can see that's the case when inserting rows into a date partitioned table.

Is there a way to set the partition time of the rows I want to insert, so that BigQuery can infer the correct partition?

So far I've tried tableRow.set("_PARTITIONTIME", milliessinceepoch); but it fails with a "no such field" error.

2 Answers

1
votes

As of a month or so ago, you can stream into a specific partition of a date-partitioned table. For example, to insert into the partition for date 20160501 in table T, you can call insertAll with the table name T$20160501.
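As a rough sketch of how the decorated table name could be built from a row's timestamp in milliseconds since the epoch: the class and method names below (PartitionDecorator, tableForTimestamp) are made up for illustration, and the code assumes that partition dates should be computed in UTC. It only builds the name string; passing that name to the insert call is left to whatever client or sink you use.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of the BigQuery or Dataflow APIs.
class PartitionDecorator {
    // Formats an instant as yyyyMMdd; UTC is an assumption about how partition days are cut.
    private static final DateTimeFormatter DAY =
        DateTimeFormatter.ofPattern("yyyyMMdd").withZone(ZoneOffset.UTC);

    // Returns the table name with a "$yyyyMMdd" partition decorator appended.
    static String tableForTimestamp(String table, long millisSinceEpoch) {
        return table + "$" + DAY.format(Instant.ofEpochMilli(millisSinceEpoch));
    }

    public static void main(String[] args) {
        // 1462060800000 ms is 2016-05-01T00:00:00Z.
        System.out.println(tableForTimestamp("T", 1462060800000L)); // prints T$20160501
    }
}
```

Rows that belong to different days would need to be grouped by their decorated table name before inserting, since each insert call targets a single table name.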

0
votes

AFAIK, as of writing, BigQuery does not allow specifying the partition manually per row - it is inferred from the time of insertion.

However, as an alternative to BigQuery's built-in partitioned tables feature, you can use Dataflow's feature for streaming to multiple BigQuery tables at the same time: see Sharding BigQuery output tables.