19
votes

I have a Lambda function that reads from DynamoDB and creates a large file (~500 MB) in /tmp, which is finally uploaded to S3. Once the upload finishes, the Lambda deletes the file from /tmp (since there is a high probability that the instance will be reused).

The function takes about 1 minute to execute, even ignoring latencies.

In this scenario, when I invoke the function again within less than a minute, I have no guarantee that there will be enough space left in /tmp to write to, and my function fails.

Questions:

1. What are the known workarounds in this kind of scenario? (Potentially give more space in /tmp, or ensure a clean /tmp for each new execution.)
2. What are the best practices for file creation and management in Lambda?
3. Can I attach an EBS volume or other storage to Lambda for the duration of the execution?
4. Is there a way to get file-system-like access to S3, so that my function can write directly to S3 instead of using /tmp?
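For context, here is a rough sketch of the flow I described above, in Python with boto3. The table name, bucket, key, and the trivial per-item processing are placeholders, not my actual code:

```python
import os
import boto3

dynamodb = boto3.resource("dynamodb")
s3 = boto3.client("s3")

# Placeholder names -- substitute your own table, bucket and key.
TABLE_NAME = "my-table"
BUCKET = "my-bucket"
TMP_PATH = "/tmp/export.dat"

def handler(event, context):
    table = dynamodb.Table(TABLE_NAME)
    with open(TMP_PATH, "w") as f:
        # Scan the table page by page and write the processed rows to /tmp.
        response = table.scan()
        while True:
            for item in response["Items"]:
                f.write(str(item) + "\n")  # stand-in for the real processing
            if "LastEvaluatedKey" not in response:
                break
            response = table.scan(ExclusiveStartKey=response["LastEvaluatedKey"])

    # Upload the finished file and remove it so a reused container starts clean.
    s3.upload_file(TMP_PATH, BUCKET, "export.dat")
    os.remove(TMP_PATH)
```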

4
I don't see why you NEED a file (or file system), especially considering that you're using FaaS/Amazon Lambda. Could you rewrite your code so that the DynamoDB output is streamed to S3 without writing it to disk? - C-Otto
There is a lot of processing that needs to be done, not just a simple dump from DynamoDB to S3. - sandeepzgk
Maybe you're just hitting the limit (512 MB) then? docs.aws.amazon.com/lambda/latest/dg/limits.html It might help to work in-memory, or to add a third service for temporary storage in between. - C-Otto
@usama Not really; the best way is to clean up after you use it, or else just clean up before use. - sandeepzgk

4 Answers

10
votes

I doubt that two concurrently running instances of AWS Lambda share /tmp or any other local resource, since they execute in complete isolation. Your error should have a different explanation. If you mean that a subsequent invocation of AWS Lambda reuses the same instance, then you should simply clear /tmp yourself.
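If that is the case, a minimal sketch of such a cleanup at the start of the handler could look like this (the helper name is just illustrative):

```python
import glob
import os

def clear_tmp():
    # Remove any files left behind by a previous invocation on a reused
    # container; /tmp is per-instance, not shared across concurrent instances.
    for path in glob.glob("/tmp/*"):
        try:
            if os.path.isfile(path):
                os.remove(path)
        except OSError:
            pass  # ignore anything we cannot delete

def handler(event, context):
    clear_tmp()
    # ... rest of the work ...
```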

In general, if your Lambda is a resource hog, you are better off doing that work in an ECS container worker and using the Lambda only to launch ECS tasks, as described here.
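A minimal sketch of that pattern, assuming an existing Fargate cluster and task definition (all names and network settings below are placeholders), might look like this:

```python
import boto3

ecs = boto3.client("ecs")

def handler(event, context):
    # Hand the heavy lifting off to an ECS task instead of doing it in Lambda.
    # Cluster, task definition and network settings are placeholders.
    ecs.run_task(
        cluster="my-cluster",
        taskDefinition="my-export-task",
        launchType="FARGATE",
        networkConfiguration={
            "awsvpcConfiguration": {
                "subnets": ["subnet-0123456789abcdef0"],
                "assignPublicIp": "ENABLED",
            }
        },
    )
    return {"status": "task started"}
```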

4
votes

You are likely running into the 512 MB /tmp limit of AWS Lambda.

You can improve performance and address your problem by keeping the file in memory, since the memory limit for Lambda functions can go as high as 1.5 GB.
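A minimal sketch of the in-memory approach with boto3 (the bucket, key, and chunk generator are placeholders for your own processing):

```python
import io
import boto3

s3 = boto3.client("s3")

def generate_chunks(event):
    # Placeholder for the real DynamoDB read and processing step.
    yield b"example data\n"

def handler(event, context):
    # Build the file in memory instead of on disk; the buffer counts against
    # the function's memory allocation rather than the 512 MB /tmp limit.
    buffer = io.BytesIO()
    for chunk in generate_chunks(event):
        buffer.write(chunk)

    buffer.seek(0)
    s3.upload_fileobj(buffer, "my-bucket", "export.dat")
```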

0
votes

As of March 2022, Lambda supports increasing the /tmp directory's size limit up to 10,240 MB. More information is available here.
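Assuming a recent boto3 that exposes the EphemeralStorage parameter, the same setting can also be applied programmatically; the function name below is a placeholder:

```python
import boto3

lambda_client = boto3.client("lambda")

# Raise the function's /tmp allocation to the 10,240 MB maximum.
lambda_client.update_function_configuration(
    FunctionName="my-function",        # placeholder name
    EphemeralStorage={"Size": 10240},  # size in MB, 512-10240
)
```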

0
votes

It is now even easier: Lambda's ephemeral storage (/tmp) can be increased up to 10 GB. The setting, named Ephemeral storage, is available under the General configuration of the Lambda function.
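As a related sketch (not part of the console steps above), you can also check at runtime how much of that ephemeral storage is actually free before writing the large file:

```python
import shutil

def tmp_free_bytes():
    # shutil.disk_usage reports total/used/free for the filesystem backing /tmp.
    total, used, free = shutil.disk_usage("/tmp")
    return free

def handler(event, context):
    if tmp_free_bytes() < 500 * 1024 * 1024:  # need roughly 500 MB
        raise RuntimeError("not enough space left in /tmp")
    # ... write the file ...
```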