I'm using a Google Cloud Function to run an ETL job:
- Get data from a JSON API
- Enrich every row of that data with another API
- Write to Cloud Storage
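
In outline, the function looks roughly like this (a simplified sketch, not my real code; `SOURCE_URL`, `ENRICH_URL`, and the bucket name are placeholders):

```python
import datetime
import json

import requests
from google.cloud import storage

SOURCE_URL = "https://api.example.com/data"    # placeholder source API
ENRICH_URL = "https://api.example.com/enrich"  # placeholder enrichment API
BUCKET_NAME = "my-etl-bucket"                  # placeholder bucket

def etl(request):
    """HTTP-triggered Cloud Function entry point (Flask request object)."""
    # Manual runs pass ?date=YYYY-MM-DD; the nightly run omits it.
    run_date = request.args.get("date") or datetime.date.today().isoformat()

    rows = requests.get(SOURCE_URL, params={"date": run_date}, timeout=30).json()

    # One blocking request per row -- this serial loop is the slow part.
    for row in rows:
        row["extra"] = requests.get(
            ENRICH_URL, params={"id": row["id"]}, timeout=30
        ).json()

    blob = storage.Client().bucket(BUCKET_NAME).blob(f"output/{run_date}.json")
    blob.upload_from_string(json.dumps(rows), content_type="application/json")
    return "ok"
```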
A Cloud Scheduler cron job runs every night and triggers the Cloud Function. I can also run the pipeline manually to query a specific date. The Cloud Function is written in Python.
The job has always taken close to 9 minutes, but it worked fine for a couple of months. Unfortunately, I'm now hitting the 9-minute hard limit of Google Cloud Functions, and I'm wondering what my best options are:
- Set up a compute engine
- Set up an app engine
- Rework the Cloud Function to parallelize the enrichment step and save time (see the sketch after this list).
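
For the third option, here is a minimal sketch of what I have in mind: the enrichment step is I/O-bound (one HTTP call per row), so a thread pool could overlap the requests. `enrich_row` and `ENRICH_URL` are placeholders for my actual per-row call:

```python
from concurrent.futures import ThreadPoolExecutor

import requests

ENRICH_URL = "https://api.example.com/enrich"  # placeholder enrichment API

def enrich_row(row):
    # Same per-row call as before, just issued from a worker thread.
    row["extra"] = requests.get(
        ENRICH_URL, params={"id": row["id"]}, timeout=30
    ).json()
    return row

def enrich_all(rows, max_workers=16):
    # HTTP calls overlap well in threads; max_workers would need tuning
    # to whatever rate the enrichment API tolerates.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(enrich_row, rows))
```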
Are there any better options? Which GCP service would be well suited for this task? Do you have any best practices? I really like the simplicity of Cloud Functions, but that simplicity comes with tradeoffs, of course...