2
votes

This template creates an SSM parameter and then tries to delete it after 5 minutes. The dependent (nested) template is not able to delete the function, and therefore both stacks fail to delete. I would like to know how to remove stacks after a time to live (5 minutes in this case).

AWSTemplateFormatVersion: '2010-09-09'
Description: Demo stack, creates one SSM parameter and gets deleted after 5 minutes.
Resources:
  DemoParameter:
    Type: "AWS::SSM::Parameter"
    Properties:
      Type: "String"
      Value: "date"
      Description: "SSM Parameter for running date command."
      AllowedPattern: "^[a-zA-Z]{1,10}$"
  DeleteAfterTTLStack:
    Type: "AWS::CloudFormation::Stack"
    Properties:
      TemplateURL: 'https://datameetgeobk.s3.amazonaws.com/cftemplates/cfn-stack-ttl_updated.yaml.txt'
      Parameters:
        StackName: !Ref 'AWS::StackName'
        TTL: '5'

I got this template from:

https://aws.amazon.com/blogs/infrastructure-and-automation/scheduling-automatic-deletion-of-aws-cloudformation-stacks/

2
What is your actual goal? Are you simply wanting to create a 'temporary' value in Parameter Store? Does this definitely need to be done via CloudFormation? Please tell us more about your wider use-case. - John Rotenstein
I buy spot instances using CFN templates and want to remove them after 1 or 2 days. This would do it automatically! As far as I know, there is no way to terminate a running instance (spot or otherwise) exactly after 48 hours, right? - shantanuo

2 Answers

5
votes

There seem to be a lot of issues with that blog post's template files. You are unable to delete the stack because the IAM role in the nested stack does not have enough permissions to delete all of the resources in the stacks (Lambda functions, IAM roles, event rules, SSM parameters, etc.).

To fix the permissions error, create a new nested template that grants additional permissions to DeleteCFNLambdaExecutionRole. The version below attaches the managed policy arn:aws:iam::aws:policy/AdministratorAccess, but I highly recommend working out the least privileges needed to delete your resources. Attaching AdministratorAccess is not good practice, but since I do not know your full use case, it's the only way to guarantee that everything will be deleted.

AWSTemplateFormatVersion: '2010-09-09'
Description: Schedule automatic deletion of CloudFormation stacks
Metadata:
  AWS::CloudFormation::Interface:
    ParameterGroups:
      - Label:
          default: Input configuration
        Parameters:
          - StackName
          - TTL
    ParameterLabels:
      StackName:
        default: Stack name
      TTL:
        default: Time-to-live
Parameters:
  StackName:
    Type: String
    Description: Stack name that will be deleted.
  TTL:
    Type: Number
    Description: Time-to-live in minutes for the stack.
Resources:
  DeleteCFNLambdaExecutionRole:
    Type: "AWS::IAM::Role"
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
        - Effect: "Allow"
          Principal:
            Service: ["lambda.amazonaws.com"]
          Action: "sts:AssumeRole"
      Path: "/"
      ManagedPolicyArns:
        - 'arn:aws:iam::aws:policy/AdministratorAccess'
  DeleteCFNLambda:
    Type: "AWS::Lambda::Function"
    DependsOn:
      - DeleteCFNLambdaExecutionRole
    Properties:
      FunctionName: !Sub "DeleteCFNLambda-${StackName}"
      Code:
        ZipFile: |
          import boto3
          import os
          import json

          stack_name = os.environ['stackName']

          def delete_cfn(stack_name):
              try:
                  cfn = boto3.resource('cloudformation')
                  stack = cfn.Stack(stack_name)
                  stack.delete()
                  return "SUCCESS"
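              except Exception as e:
                  # Log the failure so it shows up in CloudWatch Logs.
                  print("Failed to delete stack {}: {}".format(stack_name, e))
                  return "ERROR"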

          def handler(event, context):
              print("Received event:")
              print(json.dumps(event))
              return delete_cfn(stack_name)
      Environment:
        Variables:
          stackName: !Ref 'StackName'
      Handler: "index.handler"
      Runtime: "python3.12"
      Timeout: "5"
      Role: !GetAtt DeleteCFNLambdaExecutionRole.Arn
  DeleteStackEventRule:
     DependsOn:
       - DeleteCFNLambda
       - GenerateCronExpression
     Type: "AWS::Events::Rule"
     Properties:
       Description: Delete stack event
       ScheduleExpression: !GetAtt GenerateCronExpression.cron_exp
       State: "ENABLED"
       Targets: 
          - 
            Arn: !GetAtt DeleteCFNLambda.Arn
            Id: 'DeleteCFNLambda' 
  PermissionForDeleteCFNLambda: 
    Type: "AWS::Lambda::Permission"
    Properties: 
      FunctionName: !Sub "arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:DeleteCFNLambda-${StackName}"
      Action: "lambda:InvokeFunction"
      Principal: "events.amazonaws.com"
      SourceArn: !GetAtt DeleteStackEventRule.Arn
  BasicLambdaExecutionRole:
    Type: "AWS::IAM::Role"
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
        - Effect: "Allow"
          Principal:
            Service: ["lambda.amazonaws.com"]
          Action: "sts:AssumeRole"
      Path: "/"
      Policies:
      - PolicyName: "lambda_policy"
        PolicyDocument:
          Version: "2012-10-17"
          Statement:
          - Effect: "Allow"
            Action:
            - "logs:CreateLogGroup"
            - "logs:CreateLogStream"
            - "logs:PutLogEvents"
            Resource: "arn:aws:logs:*:*:*"
  GenerateCronExpLambda:
    Type: "AWS::Lambda::Function"
    Properties:
      Code:
        ZipFile: |
          from datetime import datetime, timedelta
          import os
          import logging
          import json
          import cfnresponse

          def deletion_time(ttl):
              delete_at_time = datetime.now() + timedelta(minutes=int(ttl))
              hh = delete_at_time.hour
              mm = delete_at_time.minute
              cron_exp = "cron({} {} * * ? *)".format(mm, hh)
              return cron_exp

          def handler(event, context):
            print('Received event: %s' % json.dumps(event))
            status = cfnresponse.SUCCESS
            try:
                if event['RequestType'] == 'Delete':
                    cfnresponse.send(event, context, status, {})
                else:
                    ttl = event['ResourceProperties']['ttl']
                    responseData = {}
                    responseData['cron_exp'] = deletion_time(ttl)
                    cfnresponse.send(event, context, cfnresponse.SUCCESS, responseData)
            except Exception as e:
                logging.error('Exception: %s' % e, exc_info=True)
                status = cfnresponse.FAILED
                cfnresponse.send(event, context, status, {}, None)
      Handler: "index.handler"
      Runtime: "python3.12"
      Timeout: "5"
      Role: !GetAtt BasicLambdaExecutionRole.Arn

  GenerateCronExpression:
    Type: "Custom::GenerateCronExpression"
    Version: "1.0"
    Properties:
      ServiceToken: !GetAtt GenerateCronExpLambda.Arn
      ttl: !Ref 'TTL'

Once you have made this change, upload the template to S3 and update the TemplateURL in the main stack to point to your version of the template.

AWSTemplateFormatVersion: '2010-09-09'
Description: Demo stack, creates one SSM parameter and gets deleted after 5 minutes.
Resources:
  DemoParameter:
    Type: "AWS::SSM::Parameter"
    Properties:
      Type: "String"
      Value: "date"
      Description: "SSM Parameter for running date command."
      AllowedPattern: "^[a-zA-Z]{1,10}$"
    DependsOn: DeleteAfterTTLStack
  DeleteAfterTTLStack:
    Type: "AWS::CloudFormation::Stack"
    Properties:
      TemplateURL: 'https://your-bucket.s3.amazonaws.com/delete_resources.yaml'
      Parameters:
        StackName: !Ref 'AWS::StackName'
        TTL: '5'

You may need to add the DependsOn: DeleteAfterTTLStack field to each resource to make sure the permissions are not deleted before all the other resources are removed; otherwise permission errors can occur.
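One caveat with the GenerateCronExpLambda code above: it only puts the minute and hour into the cron expression, so the rule matches every day at that time. For a TTL longer than 24 hours (such as the 1 or 2 days mentioned in the comments), the rule would fire on the first day rather than after the full TTL. A minimal sketch of a variant that also pins the day, month and year is shown here. The function name one_shot_cron and the optional now argument are my own, and it assumes the schedule is evaluated in UTC, as EventBridge cron expressions are.

```python
from datetime import datetime, timedelta, timezone


def one_shot_cron(ttl_minutes, now=None):
    """Build a cron expression that matches a single point in time.

    Field order is: minutes hours day-of-month month day-of-week year.
    Day-of-week is '?' because day-of-month is specified.
    """
    # 'now' is injectable so the function can be checked without a clock.
    now = now or datetime.now(timezone.utc)
    delete_at = now + timedelta(minutes=int(ttl_minutes))
    return "cron({} {} {} {} ? {})".format(
        delete_at.minute,
        delete_at.hour,
        delete_at.day,
        delete_at.month,
        delete_at.year,
    )


if __name__ == "__main__":
    # A 48 hour TTL, expressed in minutes as the TTL parameter expects.
    print(one_shot_cron(2880))
```

This could replace deletion_time in the custom resource Lambda; the rest of the handler would stay the same.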

Even though this should work, I agree with @John Rotenstein that CloudFormation may not be the best solution. For one, managing the permissions can be a huge pain point. It's easy to grant too many or too few permissions when configuring this template.
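As a starting point for narrowing the permissions, here is a rough sketch that builds a policy document covering only the delete actions for the resource types in these two templates. The function name build_delete_policy is hypothetical, and the action list is my assumption based on those resource types; CloudFormation may need further read or describe actions, so expect to add whatever shows up as access denied in the stack events.

```python
import json


def build_delete_policy(region, account_id, stack_name):
    """Return an IAM policy document (as a dict) for deleting the demo stacks."""
    stack_prefix = "arn:aws:cloudformation:{}:{}:stack/{}".format(
        region, account_id, stack_name
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # The parent stack and any nested stacks named after it.
                "Effect": "Allow",
                "Action": [
                    "cloudformation:DeleteStack",
                    "cloudformation:DescribeStacks",
                ],
                "Resource": [stack_prefix + "/*", stack_prefix + "-*/*"],
            },
            {
                # Assumed set of delete actions for the resources in the
                # templates; "*" should be narrowed to specific ARNs.
                "Effect": "Allow",
                "Action": [
                    "ssm:DeleteParameter",
                    "lambda:DeleteFunction",
                    "lambda:RemovePermission",
                    "events:RemoveTargets",
                    "events:DeleteRule",
                    "iam:DeleteRole",
                    "iam:DeleteRolePolicy",
                    "iam:DetachRolePolicy",
                ],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder region, account ID and stack name.
    print(json.dumps(build_delete_policy("us-east-1", "123456789012", "demo"), indent=2))
```

The printed JSON could be used as an inline policy on DeleteCFNLambdaExecutionRole in place of the AdministratorAccess managed policy.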

0
votes

A simple way to terminate an instance after a given time period is to run a command on the instance itself that sleeps for the desired period, then terminates the instance. This script can be passed in via User Data.

The termination can be done in two ways:

  • Option 1: When launching the instance, set Shutdown behavior = Terminate, then have the script shut down the instance
  • Option 2: Have the script retrieve the instance ID via instance metadata, then issue a terminate-instances command, passing the ID (this method requires AWS credentials or an IAM role)
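
Option 1 might look roughly like the sketch below. It builds the User Data script and the launch arguments as plain Python values. Instead of a literal sleep, it uses shutdown -h +N, which schedules the halt N minutes ahead; that substitution is my own choice. The helper names and the AMI ID are placeholders, not anything from a real setup.

```python
def build_user_data(ttl_minutes):
    """Return a shell script that halts the instance after ttl_minutes."""
    minutes = int(ttl_minutes)
    if minutes < 1:
        raise ValueError("ttl_minutes must be at least 1")
    # 'shutdown -h +N' schedules a halt N minutes from now.
    return "#!/bin/bash\nshutdown -h +{}\n".format(minutes)


def build_run_instances_kwargs(image_id, instance_type, ttl_minutes):
    """Return keyword arguments for launching one self-terminating instance."""
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        # With this setting, an OS-level shutdown terminates the instance
        # instead of stopping it.
        "InstanceInitiatedShutdownBehavior": "terminate",
        "UserData": build_user_data(ttl_minutes),
    }


if __name__ == "__main__":
    # "ami-12345678" is a placeholder image ID; 2880 minutes is 48 hours.
    print(build_run_instances_kwargs("ami-12345678", "t3.micro", 2880))
```

The resulting dictionary could be passed as keyword arguments to the run_instances call of a boto3 EC2 client. In a CloudFormation template, the same idea would use the InstanceInitiatedShutdownBehavior and UserData properties of AWS::EC2::Instance.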

Another popular technique is to create a stopinator, which is a script or Lambda function running on a regular basis that checks tags on instances to determine whether to stop/terminate instances. For example:

  • Run an AWS Lambda function every 6 hours
  • The function should look at all instances to identify instances to stop/terminate based on tags, running time, etc (basically, whatever you want)
  • The function then stops/terminates the instance(s)
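
The steps above could be sketched as a Lambda function like the following. The tag name ttl-hours is a made-up convention for this example, and the selection rule (terminate once the running time exceeds the tag value) is just one possible choice. boto3 is imported inside the handler so that the selection logic can be checked without AWS access.

```python
from datetime import datetime, timedelta, timezone

TTL_TAG = "ttl-hours"  # hypothetical tag holding the allowed lifetime in hours


def expired_instance_ids(instances, now=None):
    """Return IDs of instances whose running time exceeds their TTL tag.

    'instances' is a list of instance dicts in the shape returned by
    describe_instances (InstanceId, LaunchTime, Tags).
    """
    now = now or datetime.now(timezone.utc)
    expired = []
    for inst in instances:
        tags = {t["Key"]: t["Value"] for t in inst.get("Tags", [])}
        if TTL_TAG not in tags:
            continue  # untagged instances are left alone
        try:
            ttl = timedelta(hours=float(tags[TTL_TAG]))
        except ValueError:
            continue  # ignore tags that are not numbers
        if now - inst["LaunchTime"] >= ttl:
            expired.append(inst["InstanceId"])
    return expired


def handler(event, context):
    import boto3  # available in the Lambda Python runtime

    ec2 = boto3.client("ec2")
    instances = []
    paginator = ec2.get_paginator("describe_instances")
    running = [{"Name": "instance-state-name", "Values": ["running"]}]
    for page in paginator.paginate(Filters=running):
        for reservation in page["Reservations"]:
            instances.extend(reservation["Instances"])
    ids = expired_instance_ids(instances)
    if ids:
        ec2.terminate_instances(InstanceIds=ids)
    return ids
```

The function's role would need ec2:DescribeInstances and ec2:TerminateInstances, and a scheduled rule (for example, every 6 hours) would invoke the handler.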