
I am working on a DAG that queries a MySQL database, extracts the data, and loads it into Google Cloud Storage.

The table that I am trying to export includes text, int, float, varchar(20) and varchar(32) data.

I am using Airflow v1.8.0.

from datetime import datetime, timedelta

from airflow import DAG
from airflow.contrib.operators.mysql_to_gcs import MySqlToGoogleCloudStorageOperator

default_args = {
    'owner': 'tia',
    'start_date': datetime(2018, 1, 4),
    'depends_on_past': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
}

dag = DAG('mysql_to_gcs', default_args=default_args)

# Export the whole table to GCS as JSON files, plus a schema file
export_waybills = MySqlToGoogleCloudStorageOperator(
    task_id='extract_waybills',
    mysql_conn_id='podiotestmySQL',
    sql='SELECT * FROM podiodb.logistics_waybills',
    bucket='podio-reader-storage',
    filename='podio-data/waybills{}.json',
    schema_filename='podio-data/schema/waybills.json',
    dag=dag)

I came across the following error, which seems to be similar to the one in the question "Airflow mysql to gcp Dag error":

[2018-01-04 11:12:23,372] {models.py:1342} INFO - Executing on 2018-01-04 00:00:00
[2018-01-04 11:12:23,400] {base_hook.py:67} INFO - Using connection to: 35.189.207.140
[2018-01-04 11:12:24,903] {models.py:1417} ERROR - a bytes-like object is required, not 'str'
Traceback (most recent call last):
  File "/home/hyperli/.local/lib/python3.5/site-packages/airflow/models.py", line 1374, in run
    result = task_copy.execute(context=context)
  File "/home/hyperli/.local/lib/python3.5/site-packages/airflow/contrib/operators/mysql_to_gcs.py", line 91, in execute
    files_to_upload = self._write_local_data_files(cursor)
  File "/home/hyperli/.local/lib/python3.5/site-packages/airflow/contrib/operators/mysql_to_gcs.py", line 136, in _write_local_data_files
    json.dump(row_dict, tmp_file_handle)
  File "/usr/lib/python3.5/json/__init__.py", line 179, in dump
    fp.write(chunk)
  File "/usr/lib/python3.5/tempfile.py", line 622, in func_wrapper
    return func(*args, **kwargs)
TypeError: a bytes-like object is required, not 'str'
[2018-01-04 11:12:24,907] {models.py:1433} INFO - Marking task as UP_FOR_RETRY
[2018-01-04 11:12:25,037] {models.py:1462} ERROR - a bytes-like object is required, not 'str'

Does anyone know why the exception is thrown?

Can you post your MySQL schema or an example of a line you are exporting? – A.Queue
Also please provide version of airflow. – A.Queue
The version of airflow is 1.8.0 and the mySQL table I'm trying to export consists of int(11), float, text, date, varchar(20) and varchar(32) data. – Tia
Which version of python are you using? Check out the answer I posted. – A.Queue

1 Answer


Are you using Python 3? It seems that mysql_to_gcs in the latest release (1.9.0) is not Python 3 compatible.

It seems that this operator was changed here to make mysql_to_gcs Python 3 compatible, but the latest release (1.9.0) doesn't include this change.
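To illustrate why Python 3 triggers this, here is a minimal sketch that reproduces the error outside Airflow. It relies on the fact that tempfile.NamedTemporaryFile opens its file in binary mode ('w+b') by default, while json.dump writes str chunks. The row contents and the function names are made up for the example, and the encode-before-write workaround is only one plausible way to handle it, not necessarily what the upstream change does.

```python
import json
import tempfile

# Hypothetical row, standing in for one record from the MySQL cursor
row_dict = {"waybill_id": 1, "carrier": "example"}


def write_row_failing(row):
    """Write with json.dump into a binary-mode temp file, as in the traceback.

    Returns the error message on Python 3, or None if no error occurred.
    """
    # NamedTemporaryFile defaults to mode 'w+b', so it expects bytes
    with tempfile.NamedTemporaryFile(delete=True) as handle:
        try:
            json.dump(row, handle)  # json.dump emits str chunks
        except TypeError as exc:
            return str(exc)
    return None


def write_row_encoded(row):
    """Serialize to str first, then encode to bytes before writing."""
    with tempfile.NamedTemporaryFile(delete=True) as handle:
        handle.write(json.dumps(row).encode("utf-8"))
        handle.write(b"\n")  # newline-delimited JSON, one row per line
        handle.flush()
        handle.seek(0)
        return handle.read()


if __name__ == "__main__":
    print(write_row_failing(row_dict))
    print(write_row_encoded(row_dict))
```

On Python 2 the first function would not fail, because str is already a byte string there, which would explain why the operator works on Python 2 but not on Python 3.5.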