I am building an infrastructure where I'd like to sink hot and cold data separately. I'm writing the hot data to Cloud Spanner, and I'd like to write the cold data to something better suited to long-term storage, such as BigQuery.
I'm consuming data from a streaming service, but I'd like to take advantage of BigQuery's query caching mechanism, which won't be possible if I'm constantly streaming the cold data into BigQuery, since inserts invalidate cached query results. My question is whether I can fork a streaming pipeline into a batch branch, keeping the streaming side connected to Spanner and the batch side connected to BigQuery.
I can envision something along the lines of writing the cold data to Cloud Storage and loading it into BigQuery with a cron job, but is there a better/native way to achieve the stream+batch split?
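To make the fork concrete, here's a minimal sketch of what I have in mind, assuming the pipeline is Apache Beam (e.g. running on Dataflow) reading from Pub/Sub; the topic, Spanner instance/database, bucket path, window size, and the `isHot` routing predicate are all placeholders I made up for illustration. The hot branch writes straight to Spanner, while the cold branch windows the stream and writes one file per window to Cloud Storage for a scheduled load job to pick up:

```java
import com.google.cloud.spanner.Mutation;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
import org.apache.beam.sdk.io.gcp.spanner.SpannerIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Filter;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.TypeDescriptor;
import org.joda.time.Duration;

public class HotColdFork {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    PCollection<String> events =
        p.apply("ReadStream",
            PubsubIO.readStrings().fromTopic("projects/my-project/topics/events"));

    // Hot branch: stays streaming, written straight to Spanner.
    events
        .apply("KeepHot", Filter.by(HotColdFork::isHot))
        .apply("ToMutation",
            MapElements.into(TypeDescriptor.of(Mutation.class)).via(HotColdFork::toMutation))
        .apply("WriteSpanner",
            SpannerIO.write().withInstanceId("my-instance").withDatabaseId("my-database"));

    // Cold branch: window the stream and emit one file per window to GCS.
    // A scheduled `bq load` then appends these files to BigQuery, so the
    // table is only touched by batch loads and query caching keeps working.
    events
        .apply("KeepCold", Filter.by(e -> !isHot(e)))
        .apply("WindowCold", Window.into(FixedWindows.of(Duration.standardMinutes(15))))
        .apply("WriteGcs", TextIO.write()
            .to("gs://my-bucket/cold/events")
            .withWindowedWrites()
            .withNumShards(1));

    p.run();
  }

  // Placeholder hot/cold routing rule.
  private static boolean isHot(String event) {
    return event.contains("\"tier\":\"hot\"");
  }

  // Placeholder conversion from a raw event to a Spanner mutation.
  private static Mutation toMutation(String event) {
    return Mutation.newInsertOrUpdateBuilder("Events").set("Payload").to(event).build();
  }
}
```

With something like this, the BigQuery table would only ever be touched by the scheduled load jobs, but I'm still hand-rolling the batch half with the cron job, hence the question.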