
Objective: Whenever an object is stored in the bucket, trigger an AWS Batch job and pass the uploaded file's URL to it as an environment variable.

Situation: I currently have everything set up: an S3 bucket with CloudWatch Events triggering Batch jobs. However, I am unable to get the full file URL or to set environment variables.

I have followed this tutorial: https://docs.aws.amazon.com/batch/latest/userguide/batch-cwe-target.html, "To create an AWS Batch target that uses the input transformer".

[Screenshot: AWS CloudWatch input transformer]

The job is created and processed in AWS Batch, and under the job details I can see the parameters received:

S3bucket: mybucket
S3key: view-0001/custom/2019-08-07T09:40:04.989384.json

But the environment variables have not changed, and the file URL does not include the other parameters, such as the access and expiration tokens.

I have also not found any information about which other variables can be used in the input transformer. If anyone has a link to documentation, it would be welcome.

Also, according to the AWS CLI documentation, it is possible to set environment variables when submitting a job, so I guess it should be possible here as well: https://docs.aws.amazon.com/cli/latest/reference/batch/submit-job.html

So the question is: how do I submit a job with the file URL as an environment variable?


1 Answer


You could accomplish this by triggering a Lambda function from the bucket's event notification, generating a pre-signed URL in the Lambda function, and starting the Batch job from there.
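A minimal sketch of such a Lambda function, assuming Python and boto3 (the job name, queue, and job definition names below are placeholders you would replace with your own):

```python
import urllib.parse


def build_overrides(bucket, key, presigned_url):
    """Build the containerOverrides payload that passes the object's
    location and pre-signed URL to the Batch job as environment variables."""
    return {
        "environment": [
            {"name": "S3_BUCKET", "value": bucket},
            {"name": "S3_KEY", "value": key},
            {"name": "FILE_URL", "value": presigned_url},
        ]
    }


def lambda_handler(event, context):
    # boto3 is preinstalled in the Lambda Python runtime.
    import boto3

    # S3 event notifications deliver the bucket and (URL-encoded) key.
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

    # Generate a pre-signed GET URL (valid for one hour here).
    s3 = boto3.client("s3")
    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=3600,
    )

    # Submit the Batch job, overriding the container's environment.
    batch = boto3.client("batch")
    batch.submit_job(
        jobName="process-upload",            # placeholder
        jobQueue="my-job-queue",             # placeholder
        jobDefinition="my-job-definition",   # placeholder
        containerOverrides=build_overrides(bucket, key, url),
    )
```

The job's container can then read `FILE_URL` (or `S3_BUCKET`/`S3_KEY`) directly from its environment.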

However, a better approach would be to simply access the file within the Batch job using the bucket and key. You could use the AWS SDK for your language, or simply the AWS CLI. For example, to download the file:

aws s3 cp s3://$BUCKET/$KEY /tmp/file.json
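The same download with the SDK, as a sketch assuming a Python job whose container image includes boto3 and which receives `S3_BUCKET` and `S3_KEY` as environment variables (both names are assumptions, not something Batch sets for you):

```python
import os


def object_uri(bucket: str, key: str) -> str:
    """Compose the s3:// URI for the uploaded object (useful for logging)."""
    return f"s3://{bucket}/{key}"


def fetch(dest: str = "/tmp/file.json") -> str:
    # Imported here so the module stays importable without boto3 installed.
    import boto3

    bucket = os.environ["S3_BUCKET"]
    key = os.environ["S3_KEY"]
    boto3.client("s3").download_file(bucket, key, dest)
    return object_uri(bucket, key)
```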

On the other hand, if you need a pre-signed URL outside of the Batch job, you could generate one with the AWS SDK or the AWS CLI:

aws s3 presign s3://$BUCKET/$KEY

With either of these approaches to accessing the file within the Batch job, you will need to configure the instance role of your Batch compute environment with IAM access to your S3 bucket.
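A minimal policy sketch for that instance role, assuming the job only needs to read objects from the bucket (`mybucket` is the bucket name from the question):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::mybucket/*"
    }
  ]
}
```

Add `s3:ListBucket` on `arn:aws:s3:::mybucket` if the job also needs to list objects.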