
I have a Python script that imports three libraries:

import pymysql
import pandas as pd
from sqlalchemy import create_engine

I'm planning to run it as a Python Shell job on AWS Glue. Following this and this doc page, I created a setup.py:

from setuptools import setup

setup(name="pylibmodule",
      version="0.1",
      packages=[],
      install_requires=['sqlalchemy==1.3.9','pandas==0.25.3','pymysql==0.9.3']
  )

I ran python setup.py bdist_wheel, uploaded the resulting pylibmodule-0.1-py3-none-any.whl file to an S3 bucket, and then specified the bucket location in the Glue job settings. When I ran the job script, it produced an error.

After some investigation, I found that the pandas module was imported successfully, but the imports of sqlalchemy and pymysql failed:

ModuleNotFoundError: No module named 'sqlalchemy'
ModuleNotFoundError: No module named 'pymysql'
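
A check along these lines can show which of the three modules the job environment can actually locate. It is only a minimal sketch, not the script from the question; `find_missing` is a hypothetical helper name, and it relies solely on the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def find_missing(modules):
    """Return the top-level module names that cannot be located on sys.path."""
    # find_spec returns None when a top-level module cannot be found,
    # and it does not import the module itself.
    return [name for name in modules if importlib.util.find_spec(name) is None]


# The three libraries the job script depends on.
missing = find_missing(["pymysql", "pandas", "sqlalchemy"])
print("missing modules:", missing)
```

Running this at the top of the job script prints the list to the job's output log, so it is clear whether the wheel's dependencies were installed before the real imports run.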

What am I doing wrong?

As per the documentation, I think the import should be from the named module. Can you try "from pylibmodule import sqlalchemy" instead of "import sqlalchemy"? Also, I hope your Python script is in the same folder as the .whl files (as per point #4)? - Yuva

1 Answer


I ran the job again this morning without changing anything in the settings or the script, and it suddenly worked. I think the error I received last night was due to some cache residue on Glue's end.