I'm trying to create a Glue job that enumerates all tables in a database in my Data Catalog. To do so, I use the following code snippet:

import boto3

session = boto3.Session(region_name='us-east-2')
glue = session.client('glue')
# List the tables registered under the 'customer1' database
tables = glue.get_tables(
    DatabaseName='customer1'
)
print(tables)

The job hangs for about 15 minutes and the connection never seems to be established, because I eventually get the following error:

botocore.vendored.requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='glue.us-east-2.amazonaws.com', port=443): Max retries exceeded with url: / (Caused by ConnectTimeoutError(, 'Connection to glue.us-east-2.amazonaws.com timed out. (connect timeout=60)'))

This issue is specific to the Glue API. I can use the S3 API from the same job with no problems.
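
To confirm that the difference between the two services is at the network level, a quick TCP probe can be run from the same environment. This is only a minimal sketch, assuming Python 3; `can_connect` and `probe` are hypothetical helper names, and `s3.us-east-2.amazonaws.com` is assumed as the regional S3 endpoint to compare against the Glue host from the error message.

```python
import socket


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connect timeouts, refused connections, and DNS failures.
        return False


def probe(hosts, port=443, timeout=5.0):
    """Map each host name to whether a TCP connection could be opened."""
    return {host: can_connect(host, port, timeout) for host in hosts}
```

Calling `probe(["glue.us-east-2.amazonaws.com", "s3.us-east-2.amazonaws.com"])` returns within a few seconds instead of waiting for the 60-second connect timeout and retries. If only the Glue host comes back `False`, the problem is likely routing rather than IAM permissions or the boto3 call itself.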

I've gone through all my security groups and opened all ports to traffic from anywhere. I've even added self-referencing rules. None of this helped.

I can't figure out what could be blocking the connection. Is AWS specifically blocking Glue requests?

I am running into the same issue. - Roger
I have the same problem when running Glue boto client commands from a Glue dev endpoint. However, when running as a normal Glue job, all boto3 commands run successfully. - botchniaque

1 Answer

I was facing the same problem: boto3 calls to Glue or S3 would hang and eventually time out.

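I fixed it by changing the subnet ID when creating the dev endpoint. Initially I was using a subnet that routed traffic to an Internet Gateway. I switched to a subnet that routes traffic through a NAT gateway. This most likely works because Glue's network interfaces only get private IP addresses, so a route to an Internet Gateway alone cannot reach the public Glue endpoint. Hope this helps.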
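
To check which kind of subnet you have, you can inspect where its default route points. The following is a rough sketch, not a definitive diagnostic: `default_route_target` and `subnet_default_route` are hypothetical helper names, the region default is taken from the question, and it assumes the subnet has an explicitly associated route table (otherwise the VPC main route table applies and the lookup returns `None`).

```python
def default_route_target(route_table):
    """Classify the 0.0.0.0/0 route of one entry from describe_route_tables()['RouteTables']."""
    for route in route_table.get("Routes", []):
        if route.get("DestinationCidrBlock") != "0.0.0.0/0":
            continue
        if route.get("NatGatewayId"):
            return "nat-gateway"
        if route.get("GatewayId", "").startswith("igw-"):
            return "internet-gateway"
        return "other"
    return "none"


def subnet_default_route(subnet_id, region_name="us-east-2"):
    """Look up the route table explicitly associated with a subnet and classify it."""
    import boto3  # imported lazily so the helper above can be used without boto3

    ec2 = boto3.client("ec2", region_name=region_name)
    resp = ec2.describe_route_tables(
        Filters=[{"Name": "association.subnet-id", "Values": [subnet_id]}]
    )
    tables = resp["RouteTables"]
    if not tables:
        # No explicit association: the subnet falls back to the VPC main route table.
        return None
    return default_route_target(tables[0])
```

A result of `"internet-gateway"` for the dev endpoint's subnet matches the failing setup described above, while `"nat-gateway"` matches the one that worked.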