I'm trying out AWS Step Functions. Here is what I'm trying to build:

  1. Get a list of endpoints from DynamoDB (https://user:[email protected], https://user2:[email protected], etc.)
  2. From each domain, get a list of ids via /all.
  3. For each id in the result, make a series of REST calls, e.g. https://user:[email protected]/device/{id} (the domains run in parallel, but only one request at a time goes to each domain).
  4. Save the result in DynamoDB and check whether it is a duplicate.

I know how to make the REST calls and how to save to DynamoDB. What I can't find an answer to is this: how can I run /all in parallel for each domain in the array I get from DynamoDB?

Have you looked at github.com/caolan/async? Specifically async.each. - dzm
pseudo code: [start function step] -> [Lambda function 1 gets domain list -> stores them to SQS] -> [PARALLEL: each Lambda function 2 gets one domain from SQS -> processes what you want with each domain] -> [end function step] - Bui Anh Tuan

1 Answer


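An AWS Step Functions state machine has a fixed definition: the branches of a Parallel state are written out statically when the state machine is defined, and nothing that happens during an execution can add or remove them. Given this, you cannot have a dynamic number of branches in your Parallel state.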
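
To make the limitation concrete, here is a rough sketch of what a Parallel state looks like, written as a Python dict that mirrors the JSON definition. The state names and Lambda ARNs are made up for illustration. The point is only that `Branches` is a literal list in the definition, so its length is decided before any execution starts.

```python
import json

# Hypothetical Lambda ARNs and state names, for illustration only.
parallel_state = {
    "Type": "Parallel",
    "Branches": [
        {
            "StartAt": "CallDomainA",
            "States": {
                "CallDomainA": {
                    "Type": "Task",
                    "Resource": "arn:aws:lambda:us-east-1:123456789012:function:call-domain-a",
                    "End": True,
                }
            },
        },
        {
            "StartAt": "CallDomainB",
            "States": {
                "CallDomainB": {
                    "Type": "Task",
                    "Resource": "arn:aws:lambda:us-east-1:123456789012:function:call-domain-b",
                    "End": True,
                }
            },
        },
    ],
    "End": True,
}

# Serialize to the JSON form used in a state machine definition.
print(json.dumps(parallel_state, indent=2))
```

Two branches are declared here, so exactly two run, regardless of how many endpoints DynamoDB returns at run time.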

To work around this, you'll probably want to approach your design a little differently. Instead of solving this with a single Step Function, consider breaking it apart into two different state machines, as shown below.

Step Function #1: Retrieve List of Endpoints

  1. Start
  2. Task: Retrieve the list of endpoints from DynamoDB
  3. Task: For each endpoint, invoke Step Function #2 and pass in the endpoint
  4. End

You could optionally combine states #2 and #3 to simplify the state machine and your task code.
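
The fan-out task in Step Function #1 might look roughly like the sketch below. It assumes the task runs in a Lambda function and that `sfn_client` behaves like `boto3.client("stepfunctions")`; the function name, the `endpoint-` name prefix, and the `{"endpoint": ...}` input shape are my own choices, not anything prescribed by AWS.

```python
import json
import uuid


def start_endpoint_executions(sfn_client, state_machine_arn, endpoints):
    """Start one execution of Step Function #2 per endpoint.

    sfn_client is expected to behave like boto3.client("stepfunctions").
    Returns the ARNs of the executions that were started.
    """
    execution_arns = []
    for endpoint in endpoints:
        response = sfn_client.start_execution(
            stateMachineArn=state_machine_arn,
            # A random suffix keeps execution names from colliding.
            name=f"endpoint-{uuid.uuid4()}",
            # The execution input must be passed as a JSON string.
            input=json.dumps({"endpoint": endpoint}),
        )
        execution_arns.append(response["executionArn"])
    return execution_arns
```

`start_execution` returns as soon as the execution has been started, without waiting for it to finish, so the loop itself is what gives you one parallel run per domain. In a real Lambda handler you would create the client with `boto3.client("stepfunctions")` and pass in the ARN of Step Function #2.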

Step Function #2: Perform REST API Calls

  1. Start - takes a single endpoint as input state
  2. Task: Perform the series of REST calls against the endpoint
  3. Task: Store the result in DynamoDB
  4. End
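
The work done inside one execution of Step Function #2 could be sketched as follows. All three callables (`list_ids`, `fetch_device`, `save_if_new`) are hypothetical stand-ins for the REST and DynamoDB code the asker already has; they are passed in so the sketch stays independent of any HTTP or AWS library. Because the loop is plain sequential code, only one request is in flight per domain at any time, which matches the requirement in the question.

```python
def process_endpoint(endpoint, list_ids, fetch_device, save_if_new):
    """Process a single endpoint, one request at a time.

    list_ids(endpoint)               -> iterable of ids (the /all call)
    fetch_device(endpoint, id)       -> result of the /device/{id} call
    save_if_new(id, result)          -> True if stored, False if duplicate
    """
    stored = []
    duplicates = []
    for device_id in list_ids(endpoint):
        # Each call blocks until it returns, so requests to this
        # domain never overlap.
        result = fetch_device(endpoint, device_id)
        if save_if_new(device_id, result):
            stored.append(device_id)
        else:
            duplicates.append(device_id)
    return {"endpoint": endpoint, "stored": stored, "duplicates": duplicates}
```

One way to implement `save_if_new` against DynamoDB would be a conditional write that only succeeds when the item does not exist yet (for example a `ConditionExpression` using `attribute_not_exists` on the key attribute), treating a failed condition as a duplicate. Whether that fits depends on how you define "duplicate" for your results.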