If you use an EC2-hosted Zeppelin notebook, this tutorial seems to imply that the AWS Glue libraries are available for use: https://docs.aws.amazon.com/glue/latest/dg/dev-endpoint-tutorial-EC2-notebook.html

Is this assumption incorrect? From what I have found, others who struggled with this traced it to IAM permission issues.

I have given the development endpoint these roles (some of which may not be necessary, but I am desperate...):

  • AWSGlueConsoleSageMakerNotebookFullAccess
  • AWSGlueServiceNotebookRole
  • AmazonS3FullAccess
  • AWSGlueConsoleFullAccess
  • AWSGlueServiceRole

The EC2 instance has the following permissions:

  • AWSGlueConsoleSageMakerNotebookFullAccess
  • AWSCloudFormationFullAccess
  • CloudWatchLogsFullAccess
  • AmazonAthenaFullAccess
  • AWSGlueServiceNotebookRole
  • AmazonS3FullAccess
  • AWSGlueConsoleFullAccess
  • AWSGlueServiceRole

Even so, importing the Glue libraries in a notebook paragraph fails with:

Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-4247400967984532782.py", line 349, in <module>
    raise Exception(traceback.format_exc())
Exception: Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-4247400967984532782.py", line 337, in <module>
    exec(code)
  File "<stdin>", line 3, in <module>
ImportError: No module named awsglue.context
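
A quick way to narrow this down is to check whether the interpreter running the paragraph can resolve the module at all. The snippet below is only a sketch: `can_import` is a hypothetical helper, not part of Zeppelin or Glue, and it only reports what the current Python process can import.

```python
import sys


def can_import(module_name):
    """Return True if module_name can be imported by this interpreter."""
    try:
        __import__(module_name)
        return True
    except ImportError:
        return False


# Show which interpreter is running and whether the Glue modules resolve.
print("Python %s" % sys.version.split()[0])
for name in ("awsglue", "awsglue.context"):
    print("%s importable: %s" % (name, can_import(name)))
```

If both lines report False, the paragraph is most likely running in an interpreter that has no access to the Glue libraries, rather than failing on a single missing submodule.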

1 Answer

What fixed it for me was allowing network communication between Zeppelin and Glue. This network rule is not set up for you; you have to configure it explicitly.
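
To see whether the network path is open, a plain TCP connection test from the notebook host can help. This is a minimal sketch: `can_connect` is a hypothetical helper, and the host and port in the commented usage are placeholders to replace with whatever address and port your own setup uses. A successful connection only shows that the port is reachable, not that Glue is configured correctly.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Refused, timed out, or the host name did not resolve.
        return False
    conn.close()
    return True


# Example usage (placeholder values, not taken from the original post):
# print(can_connect("your-endpoint-address", 22))
```

A False result here points to a security group, network ACL, or routing rule blocking the traffic rather than to an IAM policy.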