3
votes

While reading over this S3 Lifecycle Policy document, I see that it's possible to delete S3 objects tagged with a particular key=value pair, e.g.:

<LifecycleConfiguration>
    <Rule>
        <Filter>
           <Tag>
              <Key>key</Key>
              <Value>value</Value>
           </Tag>
        </Filter>
        transition/expiration actions.
        ...
    </Rule>
</LifecycleConfiguration>

But is it possible to create a similar rule that deletes any object that does NOT have a given key=value tag? For example, any time my object is accessed I could update its tag with the current date, e.g. object-last-accessed=07-26-2019. Then I could create a Lambda function that, each day, deletes the current S3 lifecycle policy and creates a new one containing a tag for each of the last 30 days. My lifecycle policy would then automatically delete any object that has not been accessed in the last 30 days: anything last accessed more than 30 days ago would have a date value older than any value inside the lifecycle policy, and hence it would get deleted.

Here's an example of what I'd like (note that I added the made-up element <exclude>):

<LifecycleConfiguration>
    <Rule>
        <Filter>
           <exclude>
              <Tag>
                 <Key>last-accessed</Key>
                 <Value>07-30-2019</Value>
              </Tag>
              ...
              <Tag>
                 <Key>last-accessed</Key>
                 <Value>07-01-2019</Value>
              </Tag>
           </exclude>
        </Filter>
        transition/expiration actions.
        ...
    </Rule>
</LifecycleConfiguration>

Is something like my made-up <exclude> element possible? I want to delete any S3 object that has not been accessed in the last 30 days (which is different from an object that is older than 30 days).

2 Answers

1
votes

From what I understand, this is possible but via a different mechanism.

My solution is to take a slightly different approach: set a tag on every object and then alter that tag as needed. In your case, when the object is created, set object-last-accessed to "default". Do that either through an S3 trigger that invokes a Lambda function, or at the moment the object is written to S3.

When the object is accessed, update the tag value to the current date.

If you already have a bucket full of objects, you can use S3 Batch Operations to set the tag to the current date and use that as the reference point from which the files are assumed to have last been accessed.

https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObjectTagging.html
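
As a rough illustration of the tagging step, here is a minimal sketch in Python. It assumes the client object is a boto3 S3 client (boto3 is not mentioned above, it is just one way to call PutObjectTagging); the helper names `build_tagging` and `mark_accessed` are made up for this example.

```python
from datetime import date

TAG_KEY = "object-last-accessed"  # tag name used in this answer


def build_tagging(accessed_on: date) -> dict:
    """Return a Tagging payload that records the access date (YYYY-MM-DD)."""
    return {"TagSet": [{"Key": TAG_KEY, "Value": accessed_on.isoformat()}]}


def mark_accessed(s3_client, bucket: str, key: str, accessed_on: date) -> dict:
    """Set the object's last-accessed tag to the given date.

    PutObjectTagging replaces the object's whole tag set, so any other
    tags would have to be read and merged in first (not shown here).
    `s3_client` is assumed to behave like boto3.client("s3").
    """
    tagging = build_tagging(accessed_on)
    s3_client.put_object_tagging(Bucket=bucket, Key=key, Tagging=tagging)
    return tagging
```

Something like `mark_accessed(boto3.client("s3"), "my-bucket", "path/to/file", date.today())` would then be called from whatever code serves the object; the bucket and key here are placeholders.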

Now set a lifecycle rule to remove objects tagged "default" after 10 days (or whatever you want). Add further rules, one per date value, that expire objects tagged with that date 10 days after it. You will need to update the lifecycle rules periodically, but a lifecycle configuration can hold up to 1,000 rules at a time. This doc gives details of the format for a rule: https://docs.aws.amazon.com/AmazonS3/latest/API/API_LifecycleRule.html. I'd suggest something like this:

<LifecycleConfiguration>
    <Rule>
        <ID>LastAccessed Default Rule</ID>
        <Filter>
            <Tag>
                <Key>object-last-accessed</Key>
                <Value>default</Value>
            </Tag>
        </Filter>
        <Status>Enabled</Status>
        <Expiration>
            <Days>10</Days> 
        </Expiration>
    </Rule>
    <Rule>
        <ID>Last Accessed 2020-05-19 Rule</ID>
        <Filter>
            <Tag>
                <Key>object-last-accessed</Key>
                <Value>2020-05-19</Value>
            </Tag>
        </Filter>
        <Status>Enabled</Status>
        <Expiration>
            <Date>2020-05-29</Date> 
        </Expiration>
    </Rule>
</LifecycleConfiguration>
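
Since the dated rules have to be regenerated periodically, here is a sketch of how the rule set above could be built in Python. The dict layout is the one I'd expect boto3's `put_bucket_lifecycle_configuration` to take under `LifecycleConfiguration={"Rules": ...}`; using boto3 at all, the function names and the `window_days` parameter are my assumptions, not anything from the docs linked above.

```python
from datetime import date, datetime, timedelta, timezone

TAG_KEY = "object-last-accessed"  # tag name from the XML above
MAX_RULES = 1000                  # S3 limit on rules per lifecycle configuration


def default_rule(keep_days: int) -> dict:
    """Rule for objects that were never accessed (tag value "default")."""
    return {
        "ID": "LastAccessed Default Rule",
        "Filter": {"Tag": {"Key": TAG_KEY, "Value": "default"}},
        "Status": "Enabled",
        "Expiration": {"Days": keep_days},
    }


def dated_rule(accessed_on: date, keep_days: int) -> dict:
    """Rule expiring objects tagged with `accessed_on`, keep_days later."""
    expires_on = accessed_on + timedelta(days=keep_days)
    # Lifecycle expiration dates are expressed as midnight UTC.
    expires_at = datetime(expires_on.year, expires_on.month, expires_on.day,
                          tzinfo=timezone.utc)
    return {
        "ID": f"Last Accessed {accessed_on.isoformat()} Rule",
        "Filter": {"Tag": {"Key": TAG_KEY, "Value": accessed_on.isoformat()}},
        "Status": "Enabled",
        "Expiration": {"Date": expires_at},
    }


def build_rules(today: date, window_days: int, keep_days: int) -> list:
    """Default rule plus one dated rule per day in the trailing window."""
    if window_days + 1 > MAX_RULES:
        raise ValueError("too many rules for one lifecycle configuration")
    rules = [default_rule(keep_days)]
    for offset in range(window_days):
        rules.append(dated_rule(today - timedelta(days=offset), keep_days))
    return rules
```

A daily job could call `build_rules(date.today(), window_days=30, keep_days=10)` and push the result to the bucket. The window should cover at least `keep_days` so that every date tag that can still exist on an object has a matching rule.
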
1
votes

Reading further on this, as I'm faced with the same problem, an alternative is to just use the Object Lock retention mode, which allows you to set a default retention on a bucket and then change that retention period as the file is accessed. This works at a version level, i.e. each object version is retained for its own period rather than the file as a whole, so it may not be suitable for everyone. More details at https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lock-overview.html#object-lock-retention-modes