I'd like to write a Python script that manages my Google Cloud Data Fusion pipelines and instances (create, delete, start, etc.). For that purpose I use Airflow installed as a library. I've read some tutorials and documentation, but I still can't make the script connect to a Data Fusion instance. I've tried the following connection string:

export AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT='google-cloud-platform://?extra__google_cloud_platform__key_path=%2Fkeys%2Fkey.json&extra__google_cloud_platform__scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform&extra__google_cloud_platform__project=airflow&extra__google_cloud_platform__num_retries=5'

with the path to my JSON key file and my project ID, but it still doesn't work. Can you give me an example of creating that connection?
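For reference, here is a minimal sketch of how a string like the one above could be generated with the Python standard library instead of being encoded by hand. The helper name `build_gcp_conn_uri` is hypothetical, and the `extra__google_cloud_platform__*` parameter names are simply copied from the string above; whether a given Airflow version accepts them is an assumption, not something this sketch verifies.

```python
from urllib.parse import urlencode


def build_gcp_conn_uri(key_path, project,
                       scope="https://www.googleapis.com/auth/cloud-platform",
                       num_retries=5):
    """Hypothetical helper: build a URL-encoded Airflow connection URI.

    The parameter names below are taken verbatim from the connection
    string in the question; they are assumed, not checked against Airflow.
    """
    extras = {
        "extra__google_cloud_platform__key_path": key_path,
        "extra__google_cloud_platform__scope": scope,
        "extra__google_cloud_platform__project": project,
        "extra__google_cloud_platform__num_retries": num_retries,
    }
    # urlencode percent-encodes "/" and ":" in each value, which is the
    # encoding a connection URI passed via an environment variable needs.
    return "google-cloud-platform://?" + urlencode(extras)


# Same values as in the question: key file at /keys/key.json, project "airflow".
uri = build_gcp_conn_uri("/keys/key.json", "airflow")
print(f"export AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT='{uri}'")
```

With these inputs the helper reproduces the string above exactly, so the encoding itself is probably not the problem; the key file path and the project value (it should be the real GCP project ID) are more likely candidates to check.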

Hello! Are you using the Data Fusion operator airflow.readthedocs.io/en/latest/_api/airflow/providers/google/… ? All components of the URI should be URL-encoded; have you followed that rule (check here)? - aga
@muscat, Hi! Yes, I do use that operator, and I encoded the URI components. I'm just looking for a step-by-step example of how to create a Data Fusion instance, add some pipelines, and run them from a Python script, because it doesn't work for me when I follow the documentation instructions as I understood them. - Arty

1 Answer

You can find an example Python script here: https://airflow.readthedocs.io/en/latest/_modules/airflow/providers/google/cloud/example_dags/example_datafusion.html

This page provides a breakdown for each Data Fusion Operator if you would like to learn more about them: https://airflow.readthedocs.io/en/latest/howto/operator/gcp/datafusion.html