I've written a lambda function that is triggered via an s3 bucket's putObject event. I am modifying the headers of an object post upload, downloading the object, and reuploading with appropriate headers. But because the function itself uses the putObject to reupload the object, the lambda triggers itself.
1 Answers
Three options:
Use a different API to upload your changes than the one that you have an event on. ie, if your lambda is triggered by PUT, then use a POST to modify the content afterwards (tough to do since POST isn't supported well by SDKs AFAIK, so this may not be an option).
Track usage and have a small guard at the beginning of your handler to short circuit if the only changes made to a file are ones you made. If you can't programmatically detect the headers you've set, you'll probably need a small dynamo table or similar for keeping track of which files you've already touched. This will let you abort immediately and only be charged the minimum 100ms fee.
Reorganize your project to have an 'ingest' bucket and an output bucket. Un-processed are put into the former, modified, and then placed into the latter. This has a number of advantages. The first is that you don't end up with the current situation, so that's a plus. The second is that you don't have whatever process consumes these modified files potentially pulling an unmodified version. The third is that you get better insight into the process - if something goes wrong, it's easy to see which batches of files have undergone which process.
Overall, I'd recommend option 3 for you, though I know that in my lazier moments I might try to opt for 1 or 2.
Either way, good luck.