I have attached a diagram of my state machine and included its definition below for reference. The Service Status Retry Wait state waits 60 seconds before moving to the Service Status Retry state, which invokes a Lambda function to check whether the service has been created. The Service Created? state then evaluates the result.

{
  "Comment": "Step Function for Deployment",
  "StartAt": "Start",
  "States": {
    "Start": {
      "Type": "Pass",
      "Next": "Service Status"
    },
    "Service Status": {
      "Type": "Parallel",
      "Branches": [
        {
          "StartAt": "Service Status Check",
          "States": {
            "Service Status Check": {
              "Type": "Task",
              "Resource": "arn:aws:states:::lambda:invoke",
              "Parameters": {
                "FunctionName": "arn:aws:lambda:us-west-2:<acc-id>:function:<fn-name>",
                "Payload": {
                  "stepFunctionState.$": "$$.State.Name",
                  "requestContext": {
                    "stepFunction": true,
                    "accountId": "<acc-id>"
                  },
                  "resource.$": "States.Format('{}/srvc', $$.Execution.Input.resource)",
                  "path.$": "States.Format('{}/srvc', $$.Execution.Input.path)",
                  "httpMethod": "POST",
                  "pathParameters.$": "$$.Execution.Input.pathParameters",
                  "body.$": "$$.Execution.Input.body",
                  "isBase64Encoded": false,
                  "headers": {},
                  "queryStringParameters": null
                }
              },
              "ResultPath": "$.taskresult",
              "Next": "Service Created?"
            },
            "Service Created?": {
              "Type": "Choice",
              "Choices": [
                {
                  "Variable": "$.taskresult.Payload.body",
                  "StringMatches": "*\"isCreated\": \"false\"*",
                  "Next": "Service Status Retry Wait"
                },
                {
                  "Variable": "$.taskresult.Payload.body",
                  "StringMatches": "*\"isCreated\": \"true\"*",
                  "Next": "Service Status Check End"
                }
              ],
              "Default": "Service Status Retry Wait"
            },
            "Service Status Retry Wait": {
              "Type": "Wait",
              "Seconds": 60,
              "Next": "Service Status Retry"
            },
            "Service Status Retry": {
              "Type": "Task",
              "Resource": "arn:aws:states:::lambda:invoke",
              "Parameters": {
                "FunctionName": "arn:aws:lambda:us-west-2:<acc-id>:function:<func-name>",
                "Payload": {
                  "stepFunctionState.$": "$$.State.Name",
                  "requestContext": {
                    "stepFunction": true,
                    "accountId": "<acc-id>"
                  },
                  "resource.$": "States.Format('{}/srvc', $$.Execution.Input.resource)",
                  "path.$": "States.Format('{}/srvc', $$.Execution.Input.path)",
                  "httpMethod": "POST",
                  "pathParameters.$": "$$.Execution.Input.pathParameters",
                  "body.$": "$$.Execution.Input.body",
                  "isBase64Encoded": false,
                  "headers": {},
                  "queryStringParameters": null
                }
              },
              "ResultPath": "$.taskresult",
              "Next": "Service Created?"
            },
            "Service Status Check End": {
              "Type": "Pass",
              "End": true
            }
          }
        }
      ],
      "Next": "End"
    },
    "End": {
      "Type": "Pass",
      "End": true
    }
  }
}

Until the service is created, this state machine keeps looping. Is there a way to break out of the loop and exit after 5 retries? The exit can happen once either the Service Status Retry Wait state or the Service Status Retry state has run 5 times.

Alternatively, is there a way to detect that it has been looping for 1 hour and exit based on that? Either of these two options would work for me.

Any thoughts or ideas on how this can be achieved?

It seems like you could use the error/retry feature built into Step Functions instead of building your own logic for this: docs.aws.amazon.com/step-functions/latest/dg/… - Mark B
That explains retrying after an error. However, in my case I do not expect an error; I need to stop after my retry state has executed 5 times. - vkr
What I'm saying is you could configure your code that checks the service status to throw an error if the service hasn't started yet. Then you would be able to utilize this built-in retry handling functionality provided by Step Functions. - Mark B
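
A minimal sketch of what this suggestion could look like, assuming the Lambda function is changed to raise an error while the service is not yet created. The error name `ServiceNotCreatedError` is hypothetical and would have to match whatever the function actually throws. The `Parameters` block is left out here and would stay as in the question. With `MaxAttempts` set to 5, Step Functions retries the task up to 5 times, 60 seconds apart, and no separate Wait or Choice state is needed.

```json
{
  "Service Status Check": {
    "Type": "Task",
    "Comment": "Sketch only: assumes the Lambda raises ServiceNotCreatedError (hypothetical name) until the service exists",
    "Resource": "arn:aws:states:::lambda:invoke",
    "Retry": [
      {
        "ErrorEquals": ["ServiceNotCreatedError"],
        "IntervalSeconds": 60,
        "MaxAttempts": 5,
        "BackoffRate": 1.0
      }
    ],
    "ResultPath": "$.taskresult",
    "Next": "Service Status Check End"
  }
}
```

Once the retries are exhausted, the task fails with the same error, so the execution fails unless a `Catch` is added to handle it.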

1 Answer

You have two options:

  1. Handle the retries yourself: keep a retry count in the state input and check it in your Choice state.

  2. Use the built-in error/retry handling. Check whether the service has been created and, if it has not, fail the branch. Wrap the check in a Parallel state that has a Retry with MaxAttempts set to 5:

{
  "Comment": "Step Function for Deployment",
  "StartAt": "Start",
  "States": {
    "Start": {
      "Type": "Pass",
      "Next": "Service Status"
    },
    "Service Status": {
      "Type": "Parallel",
      "Branches": [
        {
          "StartAt": "Service Status Check",
          "States": {
            "Service Status Check": {
              "Type": "Task",
              "Resource": "arn:aws:states:::lambda:invoke",
              "Parameters": {
                "FunctionName": "arn:aws:lambda:us-west-2:<acc-id>:function:<fn-name>",
                "Payload": {
                  "stepFunctionState.$": "$$.State.Name",
                  "requestContext": {
                    "stepFunction": true,
                    "accountId": "<acc-id>"
                  },
                  "resource.$": "States.Format('{}/srvc', $$.Execution.Input.resource)",
                  "path.$": "States.Format('{}/srvc', $$.Execution.Input.path)",
                  "httpMethod": "POST",
                  "pathParameters.$": "$$.Execution.Input.pathParameters",
                  "body.$": "$$.Execution.Input.body",
                  "isBase64Encoded": false,
                  "headers": {},
                  "queryStringParameters": null
                }
              },
              "ResultPath": "$.taskresult",
              "Next": "Service Created?"
            },
            "Service Created?": {
              "Type": "Choice",
              "Choices": [
                {
                  "Variable": "$.taskresult.Payload.body",
                  "StringMatches": "*\"isCreated\": \"false\"*",
                  "Next": "Throw Error"
                },
                {
                  "Variable": "$.taskresult.Payload.body",
                  "StringMatches": "*\"isCreated\": \"true\"*",
                  "Next": "Succeed"
                }
              ],
              "Default": "Throw Error"
            },
            "Throw Error": {
              "Type": "Fail"
            },
            "Succeed": {
              "Type": "Pass",
              "End": true
            }
          }
        }
      ],
      "Next": "End",
      "Retry": [
        {
          "ErrorEquals": [
            "States.ALL"
          ],
          "IntervalSeconds": 3,
          "MaxAttempts": 5,
          "BackoffRate": 1.5
        }
      ]
    },
    "End": {
      "Type": "Pass",
      "End": true
    }
  }
}
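
For option 1, the counter could be sketched roughly as follows. This is only a fragment showing the states that change or are added, under a few assumptions: the counter lives at `$.retry.count` (a made-up path), the execution input is a JSON object, and the two Task states keep `"ResultPath": "$.taskresult"` so the counter is preserved. The state names `Increment Retry Count` and `Retry Limit Reached` and the error name `ServiceNotCreated` are hypothetical. `Start` sits at the top level; the other states go inside the Parallel branch.

```json
{
  "Start": {
    "Type": "Pass",
    "Comment": "Initialise the counter next to the execution input",
    "Result": { "count": 0 },
    "ResultPath": "$.retry",
    "Next": "Service Status"
  },
  "Service Created?": {
    "Type": "Choice",
    "Comment": "Success is checked first, then the retry limit",
    "Choices": [
      {
        "Variable": "$.taskresult.Payload.body",
        "StringMatches": "*\"isCreated\": \"true\"*",
        "Next": "Service Status Check End"
      },
      {
        "Variable": "$.retry.count",
        "NumericGreaterThanEquals": 5,
        "Next": "Retry Limit Reached"
      }
    ],
    "Default": "Service Status Retry Wait"
  },
  "Service Status Retry Wait": {
    "Type": "Wait",
    "Seconds": 60,
    "Next": "Increment Retry Count"
  },
  "Increment Retry Count": {
    "Type": "Pass",
    "Comment": "Hypothetical state: adds 1 to the counter before each retry",
    "Parameters": { "count.$": "States.MathAdd($.retry.count, 1)" },
    "ResultPath": "$.retry",
    "Next": "Service Status Retry"
  },
  "Retry Limit Reached": {
    "Type": "Fail",
    "Error": "ServiceNotCreated",
    "Cause": "Service was not created after 5 retries"
  }
}
```

With this layout the Service Status Retry state runs at most 5 times: the counter reaches 5 just before the fifth retry, and the next pass through the Choice state routes to the Fail state.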

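For the one-hour limit mentioned in the question, one possibility is the top-level `TimeoutSeconds` field of the state machine definition. The fragment below shows only the top-level fields, with `States` omitted and unchanged. If the execution runs longer than the given number of seconds, it fails with a `States.Timeout` error.

```json
{
  "Comment": "Step Function for Deployment",
  "TimeoutSeconds": 3600,
  "StartAt": "Start"
}
```

Note that this limit applies to the whole execution, not only to the polling loop, so it suits this case only if the rest of the workflow is short.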