
I would like to write an ETL job to:

  • query elements from BigQuery
  • apply some transformation on them
  • serialise into the TFRecords format
  • dump the .tfrecords in a Cloud Storage bucket

The solution can use Google Cloud products.

I tried using App Engine cron jobs together with the Google Cloud Python APIs, but App Engine cannot be deployed with TensorFlow (which is necessary for TFRecords serialisation).

Any recommendation on how this could be done smoothly?

You could write your pipeline in Python and then run the whole thing on a Google Cloud virtual machine? – Ben P

1 Answer


The App Engine standard environment cannot run TensorFlow, but the App Engine flexible environment can. For an example, see "How to run TensorFlow in Google App Engine Flexible Enviroment?"
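Once TensorFlow is installable (on App Engine flexible, or on a VM as suggested in the comment), the four steps from the question could look roughly like the sketch below. It is only an illustration, not a tested pipeline: the column names `name` and `value`, the `transform` logic, the local path and the bucket/blob names are all made up, and it assumes the `google-cloud-bigquery`, `google-cloud-storage` and `tensorflow` packages are installed and that default credentials are available. The file is written locally first and then uploaded, which avoids relying on TensorFlow's own `gs://` support.

```python
from typing import Any, Dict


def transform(row: Dict[str, Any]) -> Dict[str, Any]:
    """Placeholder transformation (hypothetical columns): tidy the text, cast the number."""
    return {
        "name": str(row["name"]).strip().lower(),
        "value": float(row["value"]),
    }


def to_example(row: Dict[str, Any]):
    """Wrap one transformed row in a tf.train.Example."""
    # Imported lazily so the pure-Python transform can be used without TensorFlow.
    import tensorflow as tf

    feature = {
        "name": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[row["name"].encode("utf-8")])
        ),
        "value": tf.train.Feature(
            float_list=tf.train.FloatList(value=[row["value"]])
        ),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))


def run(query: str, bucket_name: str, blob_name: str,
        local_path: str = "/tmp/output.tfrecords") -> None:
    """Query BigQuery, transform each row, write TFRecords, upload to Cloud Storage."""
    import tensorflow as tf
    from google.cloud import bigquery, storage

    # 1. Query elements from BigQuery.
    rows = bigquery.Client().query(query).result()

    # 2 + 3. Transform each row and serialise it into a local TFRecords file.
    with tf.io.TFRecordWriter(local_path) as writer:
        for row in rows:
            example = to_example(transform(dict(row.items())))
            writer.write(example.SerializeToString())

    # 4. Upload the file to a Cloud Storage bucket.
    storage.Client().bucket(bucket_name).blob(blob_name).upload_from_filename(local_path)


# Example call (hypothetical table, bucket and object names):
# run("SELECT name, value FROM my_dataset.my_table",
#     "my-bucket", "exports/output.tfrecords")
```

A cron job can then simply call `run(...)` from a request handler or a script. For large tables, a single process like this may be too slow, so treat it as a starting point only.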