0
votes

I have a CSV file (delimiter | instead of ,) in a Cloud Storage bucket that stores dates in a format such as 10022019. I need to transform these into BigQuery's accepted DATE format, 2019-02-10. Can I achieve this with a Cloud Function that reads and transforms the data before streaming the inserts into a BigQuery table?
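The core transformation can be sketched in a few lines of Python (the kind of logic a Cloud Function could apply per row); this assumes the incoming value is DDMMYYYY, which matches 10022019 mapping to 2019-02-10:

```python
from datetime import datetime

def to_bq_date(raw: str) -> str:
    """Convert a DDMMYYYY string (e.g. '10022019') to BigQuery's YYYY-MM-DD DATE format."""
    return datetime.strptime(raw, "%d%m%Y").strftime("%Y-%m-%d")

print(to_bq_date("10022019"))  # 2019-02-10
```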

Thanks.

Best regards,

1
Load the table as-is, using STRING, and later write a query to parse the date and create a new table. - Pentium10
Thank you very much. How do I trigger the next function, which executes the query to transform the data? (I already have the queries written and tested.) Does GCP create any event for the load job that loads the CSV file into the BigQuery table? I was told GCP does not emit any event for the load job, so there is no way to know whether the load job has successfully streamed the data into BigQuery. Sorry, I am new to Cloud Functions, so my question might seem quite naive :) - JJZ
Right now, you need to wait synchronously for the load job to finish. The other option is to get the job ID, introduce a delay, and call another Function in a loop until it sees the job status as finished. Emitting events is a highly voted feature request, and it might appear soon. - Pentium10
Thank you very much. It looks like waiting for the job to finish successfully before triggering another function would be the safer option then. - JJZ
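The poll-until-finished approach from the comments can be sketched generically. Here `get_status` is a hypothetical stand-in for whatever reports the job state (for example, checking a job fetched by ID with the google-cloud-bigquery client); the names are illustrative, not a specific API:

```python
import time

def wait_for_job(get_status, poll_interval=1.0, timeout=60.0):
    """Poll get_status() until it reports 'DONE' or the timeout expires.

    Returns True if the job finished within the timeout, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if get_status() == "DONE":
            return True
        time.sleep(poll_interval)
    return False
```

In practice the Python BigQuery client can also block synchronously on a load job, which matches the "wait for the job to finish" option discussed above.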

1 Answer

1
votes

Since your data is in Cloud Storage, you may consider a Cloud Function to adjust data quality before streaming/loading it into BigQuery.

If you were to use the BigQuery load API, you could consider serverless, rule-based data-quality adjustment with StorageMirror, followed by rule-based data loading with BqTail.
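A minimal sketch of the transform step such a function could perform: read the pipe-delimited content and rewrite the date column before handing rows to BigQuery. The column position and DDMMYYYY format are assumptions for illustration:

```python
import csv
import io
from datetime import datetime

def transform_rows(pipe_delimited_text, date_col=0):
    """Parse |-delimited CSV text, converting the date column from DDMMYYYY
    to BigQuery's YYYY-MM-DD, and yield the cleaned rows."""
    reader = csv.reader(io.StringIO(pipe_delimited_text), delimiter="|")
    for row in reader:
        row[date_col] = datetime.strptime(row[date_col], "%d%m%Y").strftime("%Y-%m-%d")
        yield row

for row in transform_rows("10022019|alpha\n01032020|beta"):
    print(row)
```

The cleaned rows could then be passed to a streaming insert or written back to Cloud Storage for a load job, per the options discussed above.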