6
votes

We have many AWS connection strings in Apache Airflow, and anyone can see our access keys and secret keys in the Connections section of the Airflow webserver. How can we hide or mask this sensitive data in the Airflow webserver?


We have already enabled authentication in the Airflow configuration, so unauthorized users cannot log in. But I don't want to show my keys in the web view even to authenticated users.


4 Answers

3
votes

For the Airflow Variables section, Airflow automatically hides the value of any variable whose name contains secret or password. The check is case-insensitive, so the value of a variable whose name contains SECRET will also be hidden.
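A minimal sketch of that name-based masking rule (this is my own illustration of the behavior described above, not the exact Airflow implementation):

```python
# Fragments that trigger masking; the match is case-insensitive.
SENSITIVE_FRAGMENTS = ("secret", "password")


def should_hide_value_for_key(key_name):
    """Return True if a variable with this name should be masked in the UI."""
    name = str(key_name).lower()
    return any(fragment in name for fragment in SENSITIVE_FRAGMENTS)
```

With this rule, a variable named `AWS_SECRET_KEY` or `db_password` would be masked, while `bucket_name` would be shown in plain text.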

1
vote

I found a workaround for this use case. The Airflow AWSHook has an option to pass a path to a credentials file in the connection instead of the access key and secret key themselves.


For example, create a credentials file at /root/keys/aws_keys:

[default]
aws_access_key_id=<access key>
aws_secret_access_key=<secret key>
region=<region>

[s3_prod]
aws_access_key_id=<access key>
aws_secret_access_key=<secret key>
region=<region>
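A file in this format can be read with Python's standard `configparser`, which is roughly what the hook's config-file parsing does. A small sketch (the helper name `load_aws_profile` is mine, not Airflow's):

```python
import configparser


def load_aws_profile(path, profile="default"):
    """Read an AWS-style credentials file and return one profile's key pair."""
    config = configparser.ConfigParser()
    if not config.read(path):
        raise FileNotFoundError("Couldn't read credentials file: %s" % path)
    section = config[profile]  # raises KeyError if the profile is missing
    return section["aws_access_key_id"], section["aws_secret_access_key"]
```

Calling `load_aws_profile("/root/keys/aws_keys", "s3_prod")` would return the access key and secret key of the `[s3_prod]` section, so the keys never appear in the connection itself.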

0
votes

The LDAP authentication module lets you specify a group-based filter for a group whose members will be admins, able to see that menu and the rest.

See the documentation under security.

The superuser_filter and data_profiler_filter are optional. If defined, these configurations allow you to specify LDAP groups that users must belong to in order to have superuser (admin) and data-profiler permissions. If undefined, all users will be superusers and data profilers.

Note that data profilers can run ad-hoc queries against any defined connection; they cannot see the admin menu, however. You probably don't want a broad group of users able to run arbitrary SQL over those connections, so set that filter as well.

Any user can read any variable from their DAGs and tasks, and it's easy to put those variables in places where they will show up in the logs.

The database provides a way of storing connection passwords and variable values encrypted, but that doesn't solve all your problems.

0
votes

I'm pretty sure the AWS hook also allows you to put the access key in the "Login" box and the secret key in the "Password" box on the connection screen. If the hook finds something in the Login box, it uses the Login and Password boxes as the credentials. Here is the relevant snippet from the AWS hook's source code:

    if self.aws_conn_id:
        try:
            connection_object = self.get_connection(self.aws_conn_id)
            if connection_object.login:
                aws_access_key_id = connection_object.login
                aws_secret_access_key = connection_object.password

            elif 'aws_secret_access_key' in connection_object.extra_dejson:
                aws_access_key_id = connection_object.extra_dejson['aws_access_key_id']
                aws_secret_access_key = connection_object.extra_dejson['aws_secret_access_key']

            elif 's3_config_file' in connection_object.extra_dejson:
                aws_access_key_id, aws_secret_access_key = \
                    _parse_s3_config(connection_object.extra_dejson['s3_config_file'],
                                     connection_object.extra_dejson.get('s3_config_format'))

I've also found that in Airflow 1.9 you need to specify region_name in the "Extra" box for the AWS hook, otherwise the connection will not work.
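The "Extra" box takes a JSON object; a sketch of building one with region_name set (the region value here is a placeholder, substitute your own):

```python
import json

# "eu-west-1" is a placeholder region, not something from the question.
extra = json.dumps({"region_name": "eu-west-1"})
# extra is now the string to paste into the connection's "Extra" box:
# {"region_name": "eu-west-1"}
```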