6
votes

I have three Python scripts that I want to schedule to run at different times in AWS. Currently, all three reside on an EC2 instance and I use cron to run them. The first and second scripts download some data to a specific directory on the EC2 box (say, /home/ec2-user/data). The third one uses the downloaded data to run.

Occasionally, one of the first two scripts fails, causing the third to fail too. However, cron gives me no way to retry a failed script unless I build the failure recovery logic into the scripts themselves. I am also not happy about keeping an EC2 instance around just for this. It does not feel like a good solution, and I would prefer to use a managed AWS service.

Is AWS Lambda a good service to use here? If so, how do I specify where the data is downloaded to, and where the third script reads it from?

Or is there another service in AWS that could best fit this scenario?


2 Answers

3
votes

Yes, you can use AWS Lambda for this, and S3 for your data storage needs.
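As a rough illustration of where the data would go, here is a minimal sketch that assumes the first two scripts write their downloads to an S3 bucket and the third reads them back. The bucket name, object key, and helper function names are hypothetical; only `put_object` and `get_object` are actual boto3 S3 client calls.

```python
def save_to_s3(s3, bucket, key, data):
    """Upload bytes to s3://bucket/key (what scripts 1 and 2 would do after downloading)."""
    s3.put_object(Bucket=bucket, Key=key, Body=data)


def load_from_s3(s3, bucket, key):
    """Read an object back as bytes (what script 3 would do before processing)."""
    return s3.get_object(Bucket=bucket, Key=key)["Body"].read()


def handler(event, context):
    """Hypothetical Lambda entry point for the third script."""
    import boto3  # imported here so the helpers above can be tried without AWS access

    s3 = boto3.client("s3")
    bucket = "my-data-bucket"  # assumed bucket name, replace with your own
    raw = load_from_s3(s3, bucket, "data/first.csv")  # assumed object key
    return {"bytes_read": len(raw)}
```

Passing the client in as a parameter keeps the helpers easy to exercise locally with a stand-in object before deploying anything.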

One limitation to consider: the maximum execution time allowed for a single Lambda invocation is 300 seconds, so each script has to finish within that window.

Reference: http://docs.aws.amazon.com/lambda/latest/dg/with-scheduled-events.html

0
votes

Check out the Worker Environment in AWS Elastic Beanstalk. It starts an EC2 instance and an SQS queue, both managed automatically by Elastic Beanstalk (see the docs and a simple tutorial). In the current setup, the scripts must communicate somehow, because the third script depends on the second. How long do the scripts take to run? Can you merge the scripts? Please give more details.
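On the question of merging the scripts, here is a minimal sketch, assuming each script can be exposed as a plain Python function. The names `run_with_retries` and `run_pipeline` are hypothetical, and the retry policy (a fixed number of attempts with a fixed delay) is just one simple choice. The dependent step runs only after both downloads have succeeded.

```python
import time


def run_with_retries(step, attempts=3, delay_seconds=0.0):
    """Call step() until it succeeds or attempts run out, then re-raise the last error."""
    for attempt in range(1, attempts + 1):
        try:
            return step()
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)


def run_pipeline(download_a, download_b, process, attempts=3, delay_seconds=0.0):
    """Run both downloads with retries, then the dependent step on their results."""
    a = run_with_retries(download_a, attempts, delay_seconds)
    b = run_with_retries(download_b, attempts, delay_seconds)
    return process(a, b)
```

If either download still fails after the last attempt, the exception propagates and the third step never starts, which avoids running it on missing data.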