Generally speaking, Terraform providers reflect operations that are natively supported by the underlying APIs, but in some cases we can combine various Terraform resource types to achieve functionality that the underlying provider lacks.
I believe there's no native S3 operation for bulk-copying objects from one bucket to another, so solving this with Terraform requires decomposing the problem into smaller steps, which I think in this case would be:
- Declare a new bucket, the target
- List all of the objects in the source bucket
- Declare one object in the new bucket per object in the source bucket
The AWS provider can in principle do all three of these operations: it has managed resource types for both buckets and bucket objects, and it has a data source `aws_s3_bucket_objects` which can enumerate some or all of the objects in a bucket.
We can combine those pieces in a Terraform configuration like this:
resource "aws_s3_bucket" "target" {
bucket = "copy-example-target"
}
data "aws_s3_bucket_objects" "source" {
bucket = "copy-example-source"
}
data "aws_s3_bucket_object" "source" {
for_each = toset(data.aws_s3_bucket_objects.source.keys)
bucket = data.aws_s3_bucket_objects.source.bucket
key = each.key
}
resource "aws_s3_bucket_object" "target" {
for_each = aws_s3_bucket_object.source
bucket = aws_s3_bucket.target.bucket
key = each.key
content = each.value.body
}
With that said, Terraform is likely not the best tool for this situation, for the following reasons:
- The above configuration will cause Terraform to read all of the objects in the bucket into memory, which would be time-consuming and use lots of RAM for larger buckets, and then ultimately store all of that data in the Terraform state, which would make the state itself very large.
- Because the `aws_s3_bucket_object` data source is intended mainly for retrieving small text-based objects, the above will work only if everything in the bucket meets the limitations described in the `aws_s3_bucket_object` documentation: the objects must all have text-indicating MIME types and they must all contain UTF-8 encoded text.
In this case, then, I would prefer to use a specialized tool for the job that is designed to exploit all of the features of the S3 API to make the copy as efficient as possible, such as streaming the list of objects and streaming the contents of each object in chunks to avoid the need to have all of the data in memory at once. One such tool is the AWS CLI itself, in the form of the `aws s3 cp` command with the `--recursive` option.
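For example, assuming the bucket names from the configuration above and an already-configured AWS CLI, a whole-bucket copy would look like this:

```sh
# Recursively copy every object from the source bucket to the target bucket.
aws s3 cp s3://copy-example-source s3://copy-example-target --recursive
```

(`aws s3 sync` with the same bucket arguments behaves similarly, copying only the objects that differ between the two buckets.)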