7
votes

I'm a total noob to working with AWS. I'm trying to get a pretty simple, basic operation to work: upon a file being uploaded to one S3 bucket, I want that upload to trigger a Lambda function that copies the file to another bucket.

I went to the AWS management console and created an S3 bucket in the us-west-2 region called "test-bucket-3x1" to use as my "source" bucket, and another called "test-bucket-3x2" as my "destination" bucket. I did not modify any settings when creating these buckets.

In the Lambda console, I created an S3 trigger for 'test-bucket-3x1', changed 'event type' to "ObjectCreatedByPut", and didn't change any other settings.

This is my actual lambda_function code:

import boto3
import json
s3 = boto3.resource('s3')


def lambda_handler(event, context):
    bucket = s3.Bucket('test-bucket-3x1')
    dest_bucket = s3.Bucket('test-bucket-3x2')
    print(bucket)
    print(dest_bucket)

    for obj in bucket.objects():
        dest_key = obj.key
        print(dest_key)
        s3.Object(dest_bucket.name, dest_key).copy_from(CopySource = {'Bucket': obj.bucket_name, 'Key': obj.key})

When I test this function with the basic "HelloWorld" test available from the AWS Lambda console, I receive this:

{
  "errorMessage": "'s3.Bucket.objectsCollectionManager' object is not callable",
  "errorType": "TypeError",
  "stackTrace": [
    [
      "/var/task/lambda_function.py",
      12,
      "lambda_handler",
      "for obj in bucket.objects():"
    ]
  ]
}

What changes do I need to make to my code so that, upon uploading a file to test-bucket-3x1, a Lambda function is triggered and the file is copied to test-bucket-3x2?

Thanks for your time.

3
Shouldn't you be using for obj in bucket.objects.all() instead of for obj in bucket.objects()? Refer to this link: boto3.readthedocs.io/en/latest/reference/services/… – krishna_mee2004
"object isn't callable" – you're trying to call something you should iterate over. I think you might be looking for bucket.objects.all(), which creates an iterable. – Usernamenotfound
Thanks for the help. It seems silly, but that was really useful for me. I can go to CloudWatch logs and start to get an idea of what the event and context objects actually are. On a related note, is it possible to open/work with a file in an S3 bucket via Lambda? For instance, could I load a CSV into a pandas dataframe, manipulate the dataframe, return the manipulated dataframe, and then upload that to my destination bucket? Would it be as simple as putting something like df = pd.read_excel(['Records']['bucket']['object']['key']) in the event handler? – Tkelly
Note there is an S3 bucket replication feature in AWS, if you are genuinely copying all new objects from one bucket to another. – jarmod

3 Answers

3
votes

I would start with the blueprint s3-get-object; you can create a Lambda function from a blueprint in the Lambda console.

This is the code of that blueprint:

console.log('Loading function');

const aws = require('aws-sdk');

const s3 = new aws.S3({ apiVersion: '2006-03-01' });


exports.handler = async (event, context) => {
    //console.log('Received event:', JSON.stringify(event, null, 2));

    // Get the object from the event and show its content type
    const bucket = event.Records[0].s3.bucket.name;
    const key = decodeURIComponent(event.Records[0].s3.object.key.replace(/\+/g, ' '));
    const params = {
        Bucket: bucket,
        Key: key,
    };
    try {
        const { ContentType } = await s3.getObject(params).promise();
        console.log('CONTENT TYPE:', ContentType);
        return ContentType;
    } catch (err) {
        console.log(err);
        const message = `Error getting object ${key} from bucket ${bucket}. Make sure they exist and your bucket is in the same region as this function.`;
        console.log(message);
        throw new Error(message);
    }
};

You will then need to update the code above so that it not only gets the object info but also copies the object and deletes the source. For that, you can refer to this answer:

const moveAndDeleteFile = async (file,inputfolder,targetfolder) => {
    const s3 = new AWS.S3();

    const copyparams = {
        Bucket : bucketname,
        CopySource : bucketname + "/" + inputfolder + "/" + file, 
        Key : targetfolder + "/" + file
    };

    await s3.copyObject(copyparams).promise();

    const deleteparams = {
        Bucket : bucketname,
        Key : inputfolder + "/" + file
    };

    await s3.deleteObject(deleteparams).promise();
    ....
}

Source: How to copy the object from s3 to s3 using node.js
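Since the question uses Python, here is a rough boto3 equivalent of the copy-and-delete above. The bucket and folder names are placeholders, and the key construction is split into a small pure helper:

```python
def object_keys(input_folder, target_folder, filename):
    # Pure helper: build the source and destination keys for the move.
    return f"{input_folder}/{filename}", f"{target_folder}/{filename}"

def move_object(bucket_name, input_folder, target_folder, filename):
    # Copy an object within a bucket, then delete the original.
    # boto3 is imported lazily so the helper above can be tested without AWS.
    import boto3
    s3 = boto3.client('s3')
    src, dst = object_keys(input_folder, target_folder, filename)
    s3.copy_object(Bucket=bucket_name,
                   CopySource={'Bucket': bucket_name, 'Key': src},
                   Key=dst)
    s3.delete_object(Bucket=bucket_name, Key=src)
```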

0
votes
for obj in source_bucket.objects.all():
    print(obj)
    copy_source = {'Bucket': source_bucket.name, 'Key': obj.key}
    destination_bucket.copy(copy_source, obj.key)
0
votes

You should really use the event passed to the lambda_handler() method to get the file [path|prefix|uri] and only deal with that file, since your function is triggered when a file is put in the bucket:

def lambda_handler(event, context):
    ...

    if event and event['Records']:
        for record in event['Records']:
            source_key = record['s3']['object']['key']

            ... # do something with the key: key-prefix/filename.ext
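Putting this together for the original question, a minimal handler could look like the sketch below. The destination bucket name is taken from the question; note that S3 URL-encodes object keys in event notifications, so they need to be decoded. The event parsing is kept in a pure helper so it can be tested locally:

```python
from urllib.parse import unquote_plus

DEST_BUCKET = 'test-bucket-3x2'  # destination bucket from the question

def records_to_copy(event):
    # Pure helper: pull (source_bucket, key) pairs out of an S3 event.
    # S3 URL-encodes keys in notifications, so decode them here.
    return [(r['s3']['bucket']['name'], unquote_plus(r['s3']['object']['key']))
            for r in event.get('Records', [])]

def lambda_handler(event, context):
    # boto3 is available in the Lambda runtime; imported here so the
    # helper above stays testable without AWS credentials.
    import boto3
    s3 = boto3.resource('s3')
    for src_bucket, key in records_to_copy(event):
        s3.Object(DEST_BUCKET, key).copy_from(
            CopySource={'Bucket': src_bucket, 'Key': key})
```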

For the additional question about opening files from the S3 bucket directly, I would recommend checking out smart_open, which "kind of" handles the S3 bucket like a local file system:

from pandas import DataFrame, read_csv
from smart_open import open

def read_as_csv(file_uri: str) -> DataFrame:
    with open(file_uri) as f:
        return read_csv(f, names=COLUMN_NAMES)
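The same idea works in the other direction, so the round trip asked about in the comments (load a CSV, manipulate the dataframe, upload the result) could be sketched like this. The URIs are placeholders (smart_open accepts s3:// URIs as well as local paths), and the dropna() call stands in for whatever manipulation you actually need:

```python
from pandas import read_csv
from smart_open import open

def transform_and_upload(source_uri, dest_uri):
    # Read a CSV (e.g. from S3), manipulate it, and write it back out.
    with open(source_uri) as f:
        df = read_csv(f)
    df = df.dropna()  # placeholder for the real manipulation
    with open(dest_uri, 'w') as f:
        df.to_csv(f, index=False)
```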