4
votes

I have a Lambda function attached to a DynamoDB stream. The Lambda is triggered twice when I change/modify an item in the Test-machines table in DynamoDB.

When I modify the IsMachineOn value from True to False, the Test-Machine-On-alert-status Lambda function is triggered twice.

I don't understand why the Lambda is triggered twice.

I noticed a small difference between the records in the Lambda's event parameter.

For the first trigger:

Value of NewImage["IsMachineOn"]["BOOL"] is False

Value of OldImage["IsMachineOn"]["BOOL"] is True

For the second trigger:

Value of NewImage["IsMachineOn"]["BOOL"] is False

Value of OldImage["IsMachineOn"]["BOOL"] is False

My business logic runs when NewImage["IsMachineOn"]["BOOL"] == False, so it executes twice.

There are two questions here:

  1. Why is the Lambda running twice?
  2. What is a workaround to fix this issue?
To guarantee at-least-once delivery, multiple invocations like this can happen. The question is whether your Lambda function is idempotent; if not, making it idempotent would be a workaround. – vahdet
@vahdet: My Lambda function is not idempotent. The request id is different every time. – Vivek Sable
That behavior does not necessarily make your code idempotent, but anyway; if you strictly require firing only once, I cannot think of a solution for that right now. – vahdet
Your logic should of course be testing NewImage["IsMachineOn"]["BOOL"] == False && NewImage["IsMachineOn"]["BOOL"] != OldImage["IsMachineOn"]["BOOL"] (the machine is now off, and this is a state-change event)... but it sounds as if a second, different update is triggering the second event, so you should probably review the other attributes to identify the nature of that second event's trigger. By definition, this cannot be a second Lambda trigger on the same event if new and old differ in one and are the same in the other. – Michael - sqlbot
There is a good blog post about this topic: cloudonaut.io/… The main takeaway: make sure that your Lambda function is idempotent and handles potentially multiple executions. – s.hesse
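The state-change check suggested in the comments can be sketched as follows. This is a minimal illustration, assuming the standard DynamoDB Streams record shape from the question; the function name is hypothetical.

```python
def is_machine_turned_off(record):
    """Return True only for a genuine True -> False transition of IsMachineOn."""
    ddb = record.get('dynamodb', {})
    new_val = ddb.get('NewImage', {}).get('IsMachineOn', {}).get('BOOL')
    old_val = ddb.get('OldImage', {}).get('IsMachineOn', {}).get('BOOL')
    # Fire only when the machine is now off *and* the value actually changed;
    # the second event in the question has old == new == False and is skipped.
    return new_val is False and old_val is True
```

With this guard, the first event (True -> False) passes and the second event (False -> False) is ignored, so the business logic runs only once.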

3 Answers

2
votes

We were having the same issue with our DynamoDB Global Table. What we observed is that the double events happen only in the region where you make the update; the other regions still receive one event, which is great.

The reason you get two events is that DynamoDB needs to maintain some built-in fields to prevent a loop in which regions update each other infinitely.

  1. The first event updates the attributes of the real object/fields you store; in your case that is IsMachineOn, which changes from true to false.

  2. The second event updates the special attributes like aws:rep:deleting, aws:rep:updatetime, and aws:rep:updateregion. That's why you see both old and new images with IsMachineOn as false, the new value.

Hopefully this helps clarify things a bit. It confused me for hours.

TL;DR...

Typically you can just compare the aws:rep:updatetime of the old/new images; if they are the same, it is the event updating the AWS internal fields and you can ignore it.

In our use case we rely on aws:rep:updateregion to make sure some logic runs only once (not in multiple regions), so we have to compare the aws:rep:updatetime of old/new to ignore the first event, which carries the previous region info. The good thing is that both events' new images have the correct values for the object we stored.
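The timestamp comparison described above can be sketched like this. It assumes global-table records carry aws:rep:updatetime as a number ('N') attribute in both images; the function name is illustrative.

```python
def is_internal_fields_event(record):
    """Per the answer above: when old/new aws:rep:updatetime match, the event
    only touched AWS-internal bookkeeping attributes and can be ignored."""
    ddb = record.get('dynamodb', {})
    old_ts = ddb.get('OldImage', {}).get('aws:rep:updatetime', {}).get('N')
    new_ts = ddb.get('NewImage', {}).get('aws:rep:updatetime', {}).get('N')
    # If the attribute is missing (e.g. after a put()), we cannot classify
    # the event this way and treat it as a real change.
    return old_ts is not None and old_ts == new_ts
```

This is a sketch of the heuristic, not a guaranteed classifier; verify against your own stream records before relying on it.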

update 7/23/2020

  • Another important point we noticed: if you use put() to update/create the record, the AWS built-in fields will be missing in the new image of the first event, as well as in the old image of the second event. If you depend on these fields, it is better to use update() for existing records to make sure they are always there.
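The put() vs. update() point above can be sketched with boto3. The helper below only builds the update_item arguments; the table name Test-machines is from the question, while the MachineId key name is an assumption.

```python
def build_update_args(machine_id, is_on):
    """Build arguments for Table.update_item; an update (unlike put_item)
    leaves unrelated attributes, including aws:rep:*, in place."""
    return {
        'Key': {'MachineId': machine_id},          # assumed partition key name
        'UpdateExpression': 'SET IsMachineOn = :v',
        'ExpressionAttributeValues': {':v': is_on},
    }

# Usage sketch (requires boto3 and an existing table):
# import boto3
# table = boto3.resource('dynamodb').Table('Test-machines')
# table.update_item(**build_update_args('machine-1', False))
```
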
1
vote

We observed this issue while using global tables to sync data between DynamoDB tables in multiple regions. Our assumption was that the second push is made by the global table after syncing the data between regions. I wrote a simple check that compares the old and new images (ignoring the aws:* attributes) and processes the event only if they really differ.

import copy
import logging

logger = logging.getLogger(__name__)


def check_if_dynamo_entities_are_same(dynamoStreamEvent):
    """Return True when old and new images are identical once the
    aws:* bookkeeping attributes are stripped out."""
    # Copy so that we don't change the incoming event
    eventCopy = copy.deepcopy(dynamoStreamEvent)
    if 'NewImage' not in eventCopy['dynamodb'] or 'OldImage' not in eventCopy['dynamodb']:
        logger.info("one of NewImage or OldImage is not present, returning False")
        return False
    remove_aws_keys(eventCopy['dynamodb']['NewImage'])
    remove_aws_keys(eventCopy['dynamodb']['OldImage'])
    return compare_two_json(eventCopy['dynamodb']['NewImage'],
                            eventCopy['dynamodb']['OldImage'])


def remove_aws_keys(dic):
    # Drop the global-table bookkeeping attributes (aws:rep:* etc.)
    for k in dic.copy():
        if k.startswith('aws:'):
            logger.info("popping key=%s", k)
            dic.pop(k)


def ordered(obj):
    # Recursively sort dicts and lists so the comparison ignores ordering
    if isinstance(obj, dict):
        return sorted((k, ordered(v)) for k, v in obj.items())
    if isinstance(obj, list):
        return sorted(ordered(x) for x in obj)
    return obj


def compare_two_json(json1, json2):
    """Return True if the two JSON-like structures are equal.
    Taken from https://stackoverflow.com/a/25851972/3892213"""
    return ordered(json1) == ordered(json2)
1
vote

I would also check whether the record has actually changed. For that I wrote the following code in Python 3.6:

# 'record' is a single entry from event['Records'];
# 'value_key' is the name of the attribute that holds a list ('L') of strings
old_sites = set()
new_sites = set()

# Calculate the disjoint quantity of NEW & OLD mappings
for image_name in ['OldImage', 'NewImage']:
    if record['dynamodb'] is not None and image_name in record['dynamodb']:
        ddb_entry = record['dynamodb'][image_name]
        mappings = ddb_entry[value_key]['L']
        print(f"mappings: {mappings}")

        for mapping in mappings:
            if image_name == 'OldImage':
                old_sites.add(mapping['S'])
            else:
                new_sites.add(mapping['S'])

# Entries present in exactly one of the two images are actual changes
changed_mappings = old_sites.symmetric_difference(new_sites)
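To make the symmetric-difference idea concrete, here is a tiny self-contained illustration (the site names are made up): the result is non-empty only when the set of sites actually changed, so an "old == new" replication event produces an empty set and can be skipped.

```python
old_sites = {'site-a', 'site-b'}
new_sites = {'site-b', 'site-c'}

# Elements in exactly one of the two sets, i.e. the real changes
changed_mappings = old_sites.symmetric_difference(new_sites)

# An unchanged record yields an empty set
unchanged = old_sites.symmetric_difference(old_sites)
```
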