0 votes

I'm using a DynamoDB stream to trigger a Lambda function.

My serverless.yml file is like this:

functions:
  main:
    handler: app.main.handler
    events:
      - http:
          method: any
          path: /{proxy+}
      # to keep lambda function warm
      - schedule:
          rate: rate(5 minutes)
          input:
            warmer: true
      # triggered when a new insertion is made in the dynamodb table
      - stream:
          type: dynamodb
          arn:
            Fn::GetAtt: [AsyncTaskTable, StreamArn]

resources:
  Resources:
    AsyncTaskTable:
      Type: 'AWS::DynamoDB::Table'
      Properties:
        TableName: ${self:custom.AsyncTaskTableName}
        AttributeDefinitions:
          -
            AttributeName: "uuid"
            AttributeType: "S"
        KeySchema:
          -
            AttributeName: "uuid"
            KeyType: "HASH"
        ProvisionedThroughput:
          ReadCapacityUnits: 1
          WriteCapacityUnits: 1
        TimeToLiveSpecification:
          AttributeName: "deletion_date_time"
          Enabled: true
        StreamSpecification:
          StreamViewType: NEW_IMAGE

and my handler is like this:

import sys

# json_util is assumed to come from the dynamodb_json package; it converts
# DynamoDB's typed attribute format into plain Python dictionaries
from dynamodb_json import json_util


def handler(event, context):
    print(event)

    # Scheduled warm-up invocation: nothing to do
    if event.get('warmer'):
        pass

    # Invocation triggered by an insertion into the DynamoDB table
    elif event.get('Records'):
        print('process async')

        # Convert to a plain dictionary format
        event = json_util.loads(event)

        for record in event['Records']:
            if record['eventName'] == 'INSERT':
                python_module = record['dynamodb']['NewImage']['python_module']
                python_function = record['dynamodb']['NewImage']['python_function']
                uuid = record['dynamodb']['NewImage']['uuid']
                params = record['dynamodb']['NewImage']['params']
                getattr(sys.modules[python_module], python_function)(uuid, params)

    else:
        print('else')
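To check whether the stream invocation is being retried, one option is to log the Lambda request ID (it stays identical across retries of the same batch) and isolate per-record failures so one bad record can't fail the whole batch. A minimal sketch under those assumptions, with placeholder processing:

```python
import traceback


def handler(event, context):
    """Process DynamoDB stream records, isolating per-record failures."""
    # The request ID is identical across retries of the same failed batch,
    # which is how you can confirm Lambda is replaying the event.
    print('request id:', getattr(context, 'aws_request_id', None))

    processed, failed = 0, 0
    for record in event.get('Records', []):
        if record.get('eventName') != 'INSERT':
            continue
        try:
            image = record['dynamodb']['NewImage']  # raises KeyError if malformed
            # ... call your real processing function on `image` here ...
            processed += 1
        except Exception:
            # Catching here prevents one bad record from failing the whole
            # batch and triggering an endless retry of the same event.
            traceback.print_exc()
            failed += 1
    return {'processed': processed, 'failed': failed}
```

Returning normally (instead of raising) tells Lambda the batch succeeded, so the same records are not redelivered.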

Everything works perfectly with the DynamoDB table and the handler, but for a reason I don't understand, the event received by my handler is always the same when it comes from the stream!

If it's the scheduled event or the HTTP event, I get the correct event, always corresponding to the data sent; but when it's triggered by the DynamoDB stream, it's always the same event!

I've been struggling for 3 hours now trying to figure it out. I create a new record in my DynamoDB table that has nothing to do with any previous one, and I still receive an event containing the data of that same old event.

I've deleted all the items in the DynamoDB table, but it's still the same: I receive an event and I don't know where it is coming from. It is always the same.

For instance, below is the print(event) output I always get, whatever I do on my DynamoDB table (inserting/deleting). I've created a new table for my staging/production environments and both behave the same way. In the log you can see that the date of the log doesn't match the eventSourceARN; it corresponds to the first creation of an item in the table, i.e. the first time the DynamoDB stream was triggered.

I should also point out that my handler is doing something else which fails. Could that be the reason? I.e., does it replay the stream as long as my processing doesn't complete successfully?

2020-06-09T20:45:06.963+02:00
{'Records': [{'eventID': 'b607c13dc12e16d6602890fb7ab6f418', 'eventName': 'INSERT', 'eventVersion': '1.1', 'eventSource': 'aws:dynamodb', 'awsRegion': 'eu-west-3', 'dynamodb': {'ApproximateCreationDateTime': 1591726134.0, 'Keys': {'uuid': {'S': '1234'}}, 'NewImage': {'uuid': {'S': '1234'}}, 'SequenceNumber': '100000000002840891918', 'SizeBytes': 16, 'StreamViewType': 'NEW_IMAGE'}, 'eventSourceARN': 'arn:aws:dynamodb:eu-west-3:213248478927:table/async_task-production/stream/2020-06-09T17:58:45.451'}]}
What does your 'Records' handler path actually do? Is the Lambda request ID the same each time (i.e. is it a retry of the same event)? How many times does it repeat with the same data? – jarmod
My Records path is used to detect when the handler is called from the DynamoDB stream (in that case there is a Records list). It enters this condition properly. The request ID is always the one of the first item created in the table, even though that item has been deleted. For instance, I create the '1234' uuid, delete it, then create the 'aa' uuid; my print(event) statement always prints the NewImage of the first '1234' INSERT. – yeye
I know why you have the Records handler. I'm asking what it does, as in what code does it execute? On the request ID, I'm not asking about the DynamoDB item primary key, I'm asking about the Lambda function request ID. – jarmod
I've updated my post to add the lines which are executed; basically it calls a module.function given in the data, with uuid and params as parameters, so that some data processing is handled asynchronously. BTW, just keeping the print(event) without my processing does print the new image data correctly, but adding those lines makes it replay the same event... I don't understand why. – yeye
You need to understand whether the stream-triggered invocation is failing and being retried. The Lambda request ID being the same across invocations will confirm that. Also look at the CloudWatch Logs for this Lambda function. Is it completing successfully or failing? What errors do you see? – jarmod

1 Answer

1 vote

This is normal. If DynamoDB Streams triggers your Lambda function and the function fails, the invocation is retried with the same batch of records until it succeeds or the records expire from the stream (stream records expire after 24 hours).

Lambda now supports additional failure-handling options for stream event sources (maximum retry attempts, bisecting the batch on error, and on-failure destinations), but in the typical case you should simply fix the error in your Lambda function so that it does not fail and hence is not retried.
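If you do want Lambda to give up instead of retrying indefinitely, the Serverless Framework exposes those failure-handling options directly on the stream event. A sketch of what that could look like in your serverless.yml (the StreamDLQ queue is a hypothetical resource you would have to define yourself):

```yaml
functions:
  main:
    handler: app.main.handler
    events:
      - stream:
          type: dynamodb
          arn:
            Fn::GetAtt: [AsyncTaskTable, StreamArn]
          # stop retrying a failing batch after two attempts
          maximumRetryAttempts: 2
          # split the batch in half on error to isolate the bad record
          bisectBatchOnFunctionError: true
          # send the failed batch's metadata to a queue for inspection
          destinations:
            onFailure:
              type: sqs
              arn:
                Fn::GetAtt: [StreamDLQ, Arn]
```

Note that the on-failure destination receives metadata about the failed batch (stream position, sequence numbers), not the record payloads themselves.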