2
votes

I am new to AWS Lambda and want to run code on it for a machine learning API. In summary, one function reads some CSV files into a pandas DataFrame and searches it, and the other runs pickled machine learning models in response to requests from a Flask application. To do this, I need to import pandas, joblib and possibly scikit-learn, in builds that are compatible with Amazon Linux. I am using a Windows machine.

In general, I am going with the approach of using Lambda layers, uploaded as zip files. Since Lambda already provides a pre-built layer with SciPy and NumPy, I will not include them in my own layer; if I did, I would exceed Lambda's layer size limit anyway. To be more specific, I have done the following:

  • Downloaded and extracted Linux-compatible versions of the libraries listed above. For example, from this link I downloaded "pandas-0.25.0-cp35-cp35m-manylinux1_x86_64.whl" and unzipped it to a folder.
  • The unzipped libraries are in the following directory:

    lambda_layers\python\lib\python3.7\site-packages

  • They are zipped into a file and uploaded to an S3 bucket to create a layer.

I imported the packages:

import json
import boto3
import pandas as pd

I got the following error from Lambda:

{ "errorMessage": "Unable to import module 'lambda_function': C extension: No module named 'pandas._libs.tslibs.conversion' not built. If you want to import pandas from the source directory, you may need to run 'python setup.py build_ext --inplace --force' to build the C extensions first.", "errorType": "Runtime.ImportModuleError" }
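
One likely cause, going only by the file name quoted above: the wheel is tagged cp35 (built for CPython 3.5), while the layer directory targets python3.7. Compiled extensions from a cp35 wheel cannot be loaded by Python 3.7, which would be consistent with this "C extension ... not built" error. Below is a minimal sketch for checking a wheel's tags before unzipping it. The helper names `parse_wheel_tags` and `matches_runtime` are hypothetical, and the check only handles simple single-tag names for compiled x86_64 Linux wheels like the one here.

```python
def parse_wheel_tags(filename):
    """Split a wheel file name into (python_tag, abi_tag, platform_tag)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[:-len(".whl")].split("-")
    # Layout is name-version[-build]-python-abi-platform,
    # so the last three fields are always the tags.
    if len(parts) < 5:
        raise ValueError("unexpected wheel name: " + filename)
    return tuple(parts[-3:])


def matches_runtime(filename, runtime="python3.7"):
    """Return True if the wheel's CPython tag matches a runtime name like 'python3.7'."""
    python_tag, _abi_tag, platform_tag = parse_wheel_tags(filename)
    major, minor = runtime.replace("python", "").split(".")
    wanted = "cp" + major + minor
    is_linux_x86_64 = "linux" in platform_tag and platform_tag.endswith("x86_64")
    return python_tag == wanted and is_linux_x86_64


wheel = "pandas-0.25.0-cp35-cp35m-manylinux1_x86_64.whl"
print(parse_wheel_tags(wheel))              # ('cp35', 'cp35m', 'manylinux1_x86_64')
print(matches_runtime(wheel, "python3.7"))  # False
```

If this is indeed the problem, the cp37 build of the same release (file name containing `cp37-cp37m-manylinux1_x86_64`) would be the one to unzip into a python3.7 layer.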

3
Can you try this: github.com/keithrozario/Klayers? It has a publicly available pandas layer that I haven't tested: arn:aws:lambda:<region>:113088814899:layer:Klayers-python37-pandas:<version> - keithRozario

3 Answers

3
votes

The folder structure should follow the standard layer layout. You can also use Docker to build a Linux-compatible zip of the libraries and upload it as an AWS Lambda layer. Below are the commands I tested to create the zip for the layer:

  1. Create and navigate to a directory :

    $ mkdir aws1
    $ cd aws1
    
  2. Write the lines below into a Dockerfile and finish with CTRL + D :

    $ cat > Dockerfile
    
    FROM amazonlinux:2017.03
    RUN yum -y install git \
        python36 \
        python36-pip \
        zip \
        && yum clean all
    RUN python3 -m pip install --upgrade pip \
        && python3 -m pip install boto3
    
  3. Build the image (you can give it any name) :

    $ docker build -t pythn1/lambda .
    
  4. Run the image, mounting the current directory at /var/task :

    $ docker run --rm -it -v "${PWD}":/var/task pythn1/lambda:latest bash
    
  5. Inside the container, list the packages you want to zip in requirements.txt and finish with CTRL + D (note that the PyPI name is scikit-learn, not sklearn) :

    $ cat > requirements.txt
    pandas
    scikit-learn
    
  6. Install the packages. You could try installing directly into the correct layer structure (python/lib/python3.6/site-packages/) here, but I have not tested that yet :

    $ python3 -m pip install -r requirements.txt -t /usr/lib/python3.6/dist-packages/
    
  7. Navigate to the mounted directory :

    $ cd /var/task
    
  8. Create a zip file :

    $ zip -r ./layers.zip /usr/lib/python3.6/dist-packages/
    

You should now see a layers.zip file in the aws1 folder. If you install into the correct folder structure in the first place, the steps below are not required. With the folder structure I used, the following commands are required :

  1. Unzip layers.zip.
  2. Exit Docker or open a new terminal and navigate to the folder where you unzipped the file. The unzipped files will be under usr/lib/python3.6/dist-packages/ relative to that folder.

  3. Copy these files into the correct folder structure (source first, then destination) :

    $ mkdir -p ./python/lib/python3.6/site-packages
    $ cp -r ./usr/lib/python3.6/dist-packages/. ./python/lib/python3.6/site-packages/
    
  4. Zip them again :

    $ zip -r ./lib_python.zip ./python
    
  5. Upload the zip file as a layer, and add that layer to your Lambda function. Also, make sure that you select the matching runtime (Python 3.6 for the paths above) as the compatible runtime while creating the layer.
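
Before uploading, it can help to confirm that the zip really has the layout Lambda expects for Python layers, with every entry under python/. The following is a minimal sketch using only the standard library; the function name `layer_problems` is hypothetical, and the demo archive is built in memory to mimic the layers.zip from step 8.

```python
import io
import zipfile


def layer_problems(zip_file):
    """Return a list of human-readable layout problems found in a layer zip.

    zip_file may be a path or a file-like object.
    """
    problems = []
    with zipfile.ZipFile(zip_file) as archive:
        names = archive.namelist()
    if not names:
        problems.append("archive is empty")
    for name in names:
        if "\\" in name:
            # Some Windows zip tools store backslashes, which Linux treats
            # as part of the file name rather than as a separator.
            problems.append("backslash in entry name: " + name)
        elif not name.startswith("python/"):
            problems.append("entry outside python/: " + name)
    return problems


# Demo: an archive shaped like layers.zip from step 8 (wrong layout).
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("usr/lib/python3.6/dist-packages/pandas/__init__.py", "")
for problem in layer_problems(demo):
    print(problem)
```

An empty list means no layout problem was detected by this check; it does not verify that the compiled extensions match the runtime.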

0
votes

According to this document - https://docs.aws.amazon.com/lambda/latest/dg/configuration-layers.html#configuration-layers-path, the zip for a Python layer should contain the folder python\lib\python3.7\site-packages\pandas (and likewise for the other dependencies), with python at the root of the archive.

Make sure you add the layer to your function and follow the documentation for the right permissions.
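
Since the question is asked from a Windows machine, one way to avoid path-separator surprises is to build the archive with Python's zipfile module and set the entry names explicitly. This is a minimal sketch, not a tested deployment script: the function name `build_layer_zip` is hypothetical, and it assumes the Linux-compatible packages have already been extracted into a local folder.

```python
import zipfile
from pathlib import Path, PurePosixPath


def build_layer_zip(site_packages, zip_path, runtime="python3.7"):
    """Zip every file under site_packages into python/lib/<runtime>/site-packages/.

    Returns the list of entry names written. zip_path should live outside
    site_packages so the archive does not try to include itself.
    """
    source = Path(site_packages)
    prefix = PurePosixPath("python", "lib", runtime, "site-packages")
    written = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                relative = path.relative_to(source)
                # Entry names always use forward slashes, whatever the host OS.
                arcname = str(prefix.joinpath(*relative.parts))
                archive.write(path, arcname)
                written.append(arcname)
    return written
```

A call might look like `build_layer_zip(r"lambda_layers\python\lib\python3.7\site-packages", "layer.zip")`, using the folder from the question as the source.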

0
votes

I appreciate the answers that were given; I am just posting my own answer (which I found after a whole day of looking) here for reference.

I followed this guide and also this guide.

In summary, the steps I took are:

  1. Connect to my Amazon EC2 instance (running Linux) through ssh. I wanted to deploy an application on Beanstalk, so it was already up for me anyway.
  2. Follow the steps in the first guide to install Python 3.7. Follow the steps in the second guide to install the libraries. One key note is not to install with pip install -t, since that will lead to the libraries' C extensions not being built.
  3. Zip the directory found in python\lib\python3.7\site-packages\ as mentioned by the answers here (although I did follow the directory guide in my first attempts).
  4. Get the file from the EC2 instance through FileZilla.
  5. Follow the Lambda layers guide and it is done.
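
Once the layer is attached, a small throwaway handler can confirm that the imports work before the real code is deployed. This is only a sketch: the `modules` event key is a made-up convention for this test, and the default list assumes the three libraries from the question (scikit-learn is imported as `sklearn`).

```python
import importlib


def lambda_handler(event, context):
    """Try to import each requested module and report its version or the error."""
    modules = event.get("modules", ["pandas", "joblib", "sklearn"])
    report = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
            # Not every module exposes __version__, so fall back to a marker.
            report[name] = getattr(module, "__version__", "imported")
        except ImportError as error:
            # Covers both missing modules and extension modules that fail to load.
            report[name] = "FAILED: " + str(error)
    return report
```

Invoking it with an empty test event `{}` from the Lambda console returns one entry per library, so a broken C extension shows up as a FAILED entry instead of a Runtime.ImportModuleError for the whole function.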