0
votes

I have a small computer simulation set up in AWS Lambda (Python 3.8) that needs to run a complex function several hundred times with different input parameters. The calculations don't depend on one another, so they can run in parallel. I have a "parent function" that provides the inputs and a "child function" that is called multiple times to do the calculations. The parent function then receives the results so that I can continue working with them.

The issue I am facing is that once I switch from sequential invocation (InvocationType = 'RequestResponse') to concurrent invocation (InvocationType = 'Event'), I get the error [JSONDecodeError: Expecting value: line 1 column 1 (char 0)]. Even with a high-resource Lambda "child function", my calculation takes 10-20 seconds per run, so I need the runs to execute in parallel. All the answers to similar questions and the programming examples I have found so far don't collect the results in the parent function. Does anyone know how to adjust my code to allow fully parallel computation and direct use of the results in the parent function (storing them as arrays)? As an example, here is a code sample that I copied from an online tutorial. It works perfectly fine until I switch to parallel (InvocationType = 'Event') invocation.

Parent function (this is where I change the InvocationType from 'RequestResponse' to 'Event'):

import json
import boto3
 
client = boto3.client('lambda')
 
def lambda_handler(event,context):
 
    # Creation of simple inputs for the child function

    for i in range(0,10):
    
        quantity = i
        price = i
        
        inputParams = {
            "Quantity"      : quantity,
            "UnitPrice"     : price
        }
     
    
        response = client.invoke(
            FunctionName = 'arn:aws:lambda:REGION-and-ID:function:ChildFunction',
            InvocationType = 'RequestResponse',
            Payload = json.dumps(inputParams)
        )
     
        responseFromChild = json.load(response['Payload'])
 
        # Simply printing the output without processing it

        print('\n')
        print(responseFromChild['Amount'])

Child function (simplified, processes my repeating calculation):

import json

def lambda_handler(event, context):
    
    # Read the input parameters

    quantity    = event['Quantity']
    unitPrice   = event['UnitPrice']
 
    # Define output array and perform calculation

    amount = []
    
    for i in range (0, 3):
        amount.append(quantity * unitPrice * i)
    
 
    # Format and return the result

    return {
        'Amount'        :   amount
    }
1
Hi, welcome to Stack Overflow. Do not hesitate to visit the tour and How to Ask. Also, please edit your question to improve its readability; other users will then be more willing to help. – Dorian Turba

1 Answer

1
votes

Unfortunately, as you mentioned, asynchronous invocation (the 'Event' invocation type) does not suit your use case well: the invoke call returns as soon as the request is queued, so the response carries no result for the parent function to use.
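To illustrate where the JSONDecodeError comes from: an 'Event' invocation only acknowledges the request (status code 202) and its payload is empty, so there is nothing for json.load to parse. Here is a minimal sketch that reproduces the error locally. The in-memory stream is only a stand-in for response['Payload'] (an assumption for illustration), and parse_payload is a hypothetical helper name; no AWS call is made.

```python
import io
import json

# Stand-in for response['Payload'] after an 'Event' invocation:
# the body is empty because the child's result is never sent back.
empty_payload = io.BytesIO(b"")


def parse_payload(payload):
    """Parse a payload stream the same way the parent function does."""
    return json.load(payload)


try:
    parse_payload(empty_payload)
    error_message = None
except json.JSONDecodeError as exc:
    # Same error type as reported in the question
    error_message = str(exc)

print(error_message)
```

So the error is not a bug in the child function; the parent is simply trying to decode an empty response.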

While this problem can be solved by using the 'RequestResponse' invocation type in combination with async programming, I think the more elegant solution here would be to use a feature of AWS Step Functions called 'Dynamic Parallelism'.
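For completeness, the first option could look roughly like the sketch below. It uses a thread pool rather than asyncio, which is a swapped-in but common way to run blocking 'RequestResponse' calls concurrently. The helper names invoke_child and run_children, the max_workers value, and the CHILD_FUNCTION placeholder are assumptions for illustration, not part of the original code.

```python
import json
from concurrent.futures import ThreadPoolExecutor

# Placeholder: replace with the real function name or ARN of the child
CHILD_FUNCTION = 'ChildFunction'


def invoke_child(client, function_name, params):
    """Invoke the child synchronously and return its decoded result."""
    response = client.invoke(
        FunctionName=function_name,
        InvocationType='RequestResponse',
        Payload=json.dumps(params),
    )
    return json.load(response['Payload'])


def run_children(client, function_name, inputs, max_workers=10):
    """Run all child invocations concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(
            lambda params: invoke_child(client, function_name, params),
            inputs,
        ))


def lambda_handler(event, context):
    # boto3 is imported here so the helpers above can be tried without AWS
    import boto3
    client = boto3.client('lambda')

    inputs = [{"Quantity": i, "UnitPrice": i} for i in range(10)]
    results = run_children(client, CHILD_FUNCTION, inputs)

    # Collect the child outputs as a list of arrays
    return {"Amounts": [result["Amount"] for result in results]}
```

Each thread blocks on its own invocation, so the total wall time should be closer to the slowest batch than to the sum of all runs. Note that the parent stays running (and billed) while it waits, and its timeout has to cover the longest child run.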

You can basically define a state machine with your parent Lambda as the first step, followed by a step that runs the child Lambda in parallel. A third Lambda then collects the results and processes them.

If you're somewhat familiar with Step Functions and the Amazon States Language, you can define such a 'parallel' stage as follows:

...
    "ProcessAllItems": {
      "Type": "Map",
      "InputPath": "$.detail",
      "ItemsPath": "$.inputData",
      "MaxConcurrency": 100,
      "Iterator": {
        "StartAt": "ChildFunction",
        "States": {
          "ChildFunction": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-west-2:123456789012:function:ChildFunction",
            "End": true
          }
        }
      },
...

Then the workflow looks like the following:

  1. Prepare parallel input data (ParentLambdaPrep) =>
    {
      "inputData": [
        {...},
        {...},
        {...}
       ] 
    }
    
  2. Run ChildFunction concurrently based on inputData.
  3. Process results from parallel executions.
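The three steps above can be sketched as plain Python handlers and tried locally. This is only an illustration under assumptions: the handler names are hypothetical, the child logic mirrors the question's example, and the collecting handler assumes it receives the Map state's output as a list of child results in input order. The list comprehension at the bottom is a local stand-in for the state machine, not how Step Functions is actually called.

```python
def prep_handler(event, context):
    """Step 1: build one input item per child run."""
    return {"inputData": [{"Quantity": i, "UnitPrice": i} for i in range(10)]}


def child_handler(event, context):
    """Step 2: the question's calculation, applied to a single item."""
    quantity = event["Quantity"]
    unit_price = event["UnitPrice"]
    return {"Amount": [quantity * unit_price * i for i in range(3)]}


def collect_handler(event, context):
    """Step 3: assumes event is a list with one child result per input item."""
    return {"Amounts": [result["Amount"] for result in event]}


# Local stand-in for the state machine: run the steps in order, no AWS involved
prepared = prep_handler({}, None)
map_output = [child_handler(item, None) for item in prepared["inputData"]]
collected = collect_handler(map_output, None)

print(collected["Amounts"][2])
```

In the real state machine, the paths in the Map state (InputPath, ItemsPath) would need to match wherever the first step actually places inputData.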

It takes a bit of trial and error to work out the format in which the input data should be supplied, how the concurrent Lambdas receive it, and how the third Lambda receives the results. Architecturally, however, it's quite sound.

More info on dynamic parallelism can be found here: https://aws.amazon.com/blogs/aws/new-step-functions-support-for-dynamic-parallelism/.