I wanted to be able to store BigQuery results as JSON files in Google Cloud Storage. I could not find an out-of-the-box way to do this, so I ended up with the following (a rough code sketch is below the list):
- Run the query against BigQuery and store the results in a permanent table. I use a random GUID to name the permanent table.
- Read the data back from BigQuery, convert it to JSON in my server-side code, and upload the JSON data to GCS.
- Delete the permanent table.
- Return the GCS URL of the JSON file to the front-end application.
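For reference, here is a minimal sketch of what I am doing today (Python, using the official google-cloud-bigquery and google-cloud-storage client libraries; the dataset and bucket names are placeholders):

```python
import json
import uuid

from google.cloud import bigquery, storage

bq = bigquery.Client()
gcs = storage.Client()

def query_to_gcs_json(sql: str, dataset: str, bucket_name: str) -> str:
    # Step 1: run the query into a permanent table named with a random GUID.
    table_id = f"{bq.project}.{dataset}.{uuid.uuid4().hex}"
    job_config = bigquery.QueryJobConfig(destination=table_id)
    bq.query(sql, job_config=job_config).result()

    # Step 2: pull the rows back out and serialize them to JSON server-side.
    rows = [dict(row) for row in bq.list_rows(table_id)]
    blob = gcs.bucket(bucket_name).blob(f"results/{table_id.split('.')[-1]}.json")
    # default=str covers non-JSON-native types such as dates and decimals.
    blob.upload_from_string(json.dumps(rows, default=str),
                            content_type="application/json")

    # Step 3: drop the permanent table now that the file exists.
    bq.delete_table(table_id)

    # Step 4: hand the GCS URL back to the front end.
    return blob.public_url
```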
While this works, there are some issues with it:
A. I do not believe I am benefiting from BigQuery's query caching, since I write results to my own permanent tables. Can someone confirm this?

B. Step 2 will be a performance bottleneck. Pulling data out of GCP just to convert it to JSON and re-upload it back into GCP feels wrong. A better approach would be a cloud-native serverless function, or some other GCP data workflow service, that runs this step inside GCP and is triggered when a new table is created in the dataset (a rough sketch of the kind of function I have in mind is below). What do you think is the best way to achieve this step?

C. Is there really no way to do this without using permanent tables?
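To make B concrete, this is the kind of serverless step I am imagining: a Cloud Function that does the table-to-JSON-to-GCS conversion inside GCP, so the data never leaves the platform. I have written it as HTTP-triggered for simplicity; the table-creation trigger wiring is exactly the part I am asking about, and the names are placeholders:

```python
import json

from google.cloud import bigquery, storage

def export_table(request):
    # Expect {"table_id": "...", "bucket": "..."} in the request body.
    params = request.get_json()

    # Read the table and serialize it, all within GCP's network.
    bq = bigquery.Client()
    rows = [dict(row) for row in bq.list_rows(params["table_id"])]

    blob = (
        storage.Client()
        .bucket(params["bucket"])
        .blob(f"results/{params['table_id'].split('.')[-1]}.json")
    )
    # default=str covers non-JSON-native types such as dates and decimals.
    blob.upload_from_string(json.dumps(rows, default=str),
                            content_type="application/json")
    return blob.public_url
```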
Any help appreciated. Thanks.