2
votes

I'm having some issues setting up Elastic logging in Apache Airflow. Since version 1.10 Elastic logging has been added to the configuration.

When looking at the airflow.cfg file we have two sections related to Elastic:

# Airflow can store logs remotely in AWS S3, Google Cloud Storage or Elastic Search.
# Users must supply an Airflow connection id that provides access to the storage
# location. If remote_logging is set to true, see UPDATING.md for additional
# configuration requirements.
remote_logging = True
remote_log_conn_id =
remote_base_log_folder =
encrypt_s3_logs = False

[elasticsearch]
elasticsearch_host = xxx.xxx.xxx.xxx
elasticsearch_log_id_template = {dag_id}-{task_id}-{execution_date}-{try_number}
elasticsearch_end_of_log_mark = end_of_log


Now I'm not really sure how to set this up. When looking at the airflow_local_settings.py file we can see the following pieces of code:

if REMOTE_LOGGING and REMOTE_BASE_LOG_FOLDER.startswith('s3://'):
        DEFAULT_LOGGING_CONFIG['handlers'].update(REMOTE_HANDLERS['s3'])
elif REMOTE_LOGGING and REMOTE_BASE_LOG_FOLDER.startswith('gs://'):
        DEFAULT_LOGGING_CONFIG['handlers'].update(REMOTE_HANDLERS['gcs'])
elif REMOTE_LOGGING and REMOTE_BASE_LOG_FOLDER.startswith('wasb'):
        DEFAULT_LOGGING_CONFIG['handlers'].update(REMOTE_HANDLERS['wasb'])
elif REMOTE_LOGGING and ELASTICSEARCH_HOST:
        DEFAULT_LOGGING_CONFIG['handlers'].update(REMOTE_HANDLERS['elasticsearch'])

So logically speaking if I set the remote logging to True and put the elastic's host/ip in the elastic section it should work. At the moment no logs are being generated from the airflow instance.

1

1 Answers

1
votes

According to Airflow ElasticsearchTaskHandler doc

    ElasticsearchTaskHandler is a python log handler that
    reads logs from Elasticsearch. Note logs are not directly
    indexed into Elasticsearch. Instead, it flushes logs
    into local files. Additional software setup is required
    to index the log into Elasticsearch, such as using
    Filebeat and Logstash.

Unfortunately this log handler doesn't flush logs to your ES directly.