3 votes

I have installed airflow and I've written a DAG to integrate MySQL data with BigQuery.

When I run the Python script, I get the following error:

ImportError: cannot import name GbqConnector

I followed the instructions to downgrade pandas to an older version. When I did so, I got a different error:

ImportError: cannot import name _test_google_api_imports

Edit: the advice from x97Core worked.

I have a different problem now: I am getting the following error:

/usr/local/lib/python2.7/dist-packages/airflow/models.py:1927: PendingDeprecationWarning: Invalid arguments were passed to MySqlToGoogleCloudStorageOperator. Support for passing such arguments will be dropped in Airflow 2.0. Invalid arguments were:

*args: ()

**kwargs: {'google_cloud_storage_connn_id': 'podioGCPConnection'} category=PendingDeprecationWarning

/usr/local/lib/python2.7/dist-packages/airflow/models.py:1927: PendingDeprecationWarning: Invalid arguments were passed to GoogleCloudStorageToBigQueryOperator. Support for passing such arguments will be dropped in Airflow 2.0. Invalid arguments were:

*args: ()

**kwargs: {'project_id': 'podio-data'} category=PendingDeprecationWarning
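The warnings above mean that Airflow did not recognise those keyword arguments and silently ignored them (note the extra "n" in 'google_cloud_storage_connn_id', which is likely why it is reported as invalid). Below is a minimal, self-contained sketch of that mechanism; it is not Airflow's actual code, and the class and task names are made up for illustration:

```python
import warnings

# Sketch (assumption: simplified model of how Airflow 1.x's BaseOperator
# treats unexpected keyword arguments): unknown kwargs are collected and
# reported via a PendingDeprecationWarning instead of raising a TypeError.
class BaseOperatorSketch(object):
    def __init__(self, task_id, **kwargs):
        self.task_id = task_id
        if kwargs:
            warnings.warn(
                'Invalid arguments were passed to {}. '
                'Invalid arguments were: **kwargs: {}'.format(
                    type(self).__name__, kwargs),
                PendingDeprecationWarning,
            )

# Hypothetical operator that only accepts the correctly spelled
# 'google_cloud_storage_conn_id' keyword argument.
class MySqlToGcsSketch(BaseOperatorSketch):
    def __init__(self, task_id,
                 google_cloud_storage_conn_id='google_cloud_default',
                 **kwargs):
        self.google_cloud_storage_conn_id = google_cloud_storage_conn_id
        super(MySqlToGcsSketch, self).__init__(task_id, **kwargs)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    # The misspelled kwarg (three n's) does not match the declared
    # parameter, falls into **kwargs, and triggers the warning from
    # the question.
    MySqlToGcsSketch(task_id='t1',
                     google_cloud_storage_connn_id='podioGCPConnection')

print(len(caught))  # 1 warning captured
```

Under this model, fixing the spelling of the kwarg (and dropping any argument the operator does not declare) makes the warning go away, because nothing falls into **kwargs.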

According to this link (Airflow mysql to gcp Dag error), the issue is with Airflow's compatibility with Python 2 and Python 3. I've tried running the code on both, but the same error still comes up.

Does anyone know if there is a solution for this?


2 Answers

7 votes

Just ran into this issue.

Downgrading the pandas version seems to work (tested on Airflow v1.8.0):

pip install pandas==0.18.1

For more details: https://issues.apache.org/jira/browse/AIRFLOW-1179

Or, if you are using Airflow 1.8.2 or above:

pip install pandas-gbq
4 votes

This can be a little complicated. I suggest that you read these two fantastic links:

https://wecode.wepay.com/posts/wepays-data-warehouse-bigquery-airflow

And Van Boxel's Medium series:

https://medium.com/google-cloud/airflow-for-google-cloud-part-1-d7da9a048aa4