1
votes

I’m attempting to combine IAM Database Authentication(https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.IAMDBAuth.html) with Airflow’s sql_alchemy_conn_cmd(https://airflow.apache.org/howto/set-config.html#) with the a connect shell script to try and secure a connection.

connect.sh

#!/bin/bash
token=`aws rds generate-db-auth-token --hostname  an_rds_endpoint --port 3306 --region us-east-1 --username airflow`
url=“mysql://airflow:'$token'@an_rds_endpoint/airflow”
sed  "s/%/%%/g" <<< "$url”

I currently have an EC2 instance with the mysql dev tools connecting successfully to an RDS MySQL database via IAM DB Authentication using these steps(https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.IAMDBAuth.Connecting.AWSCLI.html)

The trouble I am having is that I am getting the following error:

super(Connection, self).__init__(*args, **kwargs2)
OperationalError: (OperationalError) (1045, "Access denied for user ‘airflow’@‘ip_address_of_ec2_instance’ (using password: YES)")

…after I attempt to run airflow initdb with my sql_alchemy_conn_cmd = connect.sh in my airflow.cfg file.

My initial guess is that the encoding on the “insanely verbose token AWS generates” is the issue but I was wondering if anyone had gone down this road yet and can help.

This is part of a token that’s generated for reference.

rdsmysql.cdgmuqiadpid.us-west-2.rds.amazonaws.com:3306/?Action=connect&DBUser=jane_doe&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Expires=900...
1
my initial guess is the single quotes around the token are not helping.dlamblin

1 Answers

0
votes

My initial guesses are that the single quotes are not helping with the token, but maybe they are needed, I don't know much about using the tokens.

My second issue with the connect.sh is the triple angle HEREDOC syntax. I think you want:

- sed  "s/%/%%/g" <<< "$url”
+ sed  "s/%/%%/g" << "$url”

But maybe you just made a typo in StackOverflow because this wouldn't get you to starting the connection.

So it could be several more things: you don't note that you granted the airflow user access specifically, nor that that's the user you're using with mysql tools (which is working).

The scheduler and web server may be making multiple connections to the db as they loop over the files to fill the dag bag (iirc it's one connection per file parsed, which happens in a loop as frequently as possible). AWS says this token doesn't work for a lot of connections.

The token is time limited, but you need it for a couple of long running processes, and the workers too (they make a new connection for each task executed). But it should work for the initdb command at least.

The good news seems to be that connect.sh is being run, and you're getting an error from MySQL.

Maybe throw in a little debugging:

sed  "s/%/%%/g" << "$url” | tee debug_connect_sh.txt

Then edit your airflow.cfg with this output assigned to sql_alchemy_conn for a bit (removing sql_alchemy_conn_cmd), and iterate trying your airflow initdb until you can figure out what isn't parsing well in the connection string. Then circle back to fix the connect.sh. Please share results.