11
votes

Is there a way to copy an S3 bucket including the versions of objects?

I read that a way to copy a bucket is by using the command line tool with

aws s3 sync s3://<source> s3://<dest>

However, in the source bucket I had:

[screenshot from the source bucket]

while in the synced bucket I have:

[screenshot from the synced bucket]

As you can see, the Version ID is "null". Is there a way to make a 100% identical copy, including the version ID? This is important for our backups and development server, because our app relies on the version ID.

Edit: If I turn on versioning before the sync, I get version IDs other than null. However, they differ from the ones in the original bucket, and the goal is to preserve the version IDs. I also tried the cp command; it yields the same result. This behavior is also documented here:

If you enable versioning on the target bucket, Amazon S3 generates a unique version ID for the object being copied. This version ID is different from the version ID of the source object. Amazon S3 returns the version ID of the copied object in the x-amz-version-id response header in the response.

If you do not enable versioning or suspend it on the target bucket, the version ID Amazon S3 generates is always null.

So it looks like AWS doesn't provide a way to preserve version IDs? If so, is there a workaround or third-party software for this?

2
Hey, did you find a way to make exact replicas without using CRR? Did you manage to sync and preserve the version IDs? I currently have the exact same issue... - Joze
@Joze No, unfortunately we didn't find a solution to this problem. - bersling

2 Answers

1
votes
  1. No, you can't get the same version ID. Version IDs are unique, much like bucket names, and only Amazon S3 assigns them. No third-party tool can set or edit a version ID to match the source, even if you record the original value, because the version ID cannot be changed through the API.

  2. If versioning is not enabled on the new bucket, the object's version ID is reported as null, as you mentioned. What you can do is store the old version ID together with the new version ID and treat the pair as an alias.
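The alias idea in point 2 could be sketched roughly as follows. This is only an illustration, not something S3 provides: the class name `VersionAliasMap`, its methods, and the example IDs are all hypothetical, and it assumes the app can look up version IDs through such a table instead of using them directly.

```python
import json


class VersionAliasMap:
    """Hypothetical lookup table: (key, source version ID) -> destination version ID."""

    def __init__(self):
        self._aliases = {}

    def record(self, key, source_version_id, dest_version_id):
        # Remember which version in the copy corresponds to a source version.
        self._aliases.setdefault(key, {})[source_version_id] = dest_version_id

    def resolve(self, key, source_version_id):
        # Return the destination version ID, or None if no alias was recorded.
        return self._aliases.get(key, {}).get(source_version_id)

    def to_json(self):
        # Serialize so the table can be stored next to the backup.
        return json.dumps(self._aliases, sort_keys=True)

    @classmethod
    def from_json(cls, text):
        instance = cls()
        instance._aliases = json.loads(text)
        return instance
```

The app on the development server would then call `resolve(key, original_version_id)` before requesting an object from the copied bucket.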

Here is the AWS Documentation explanation for it:

You must explicitly enable versioning on your bucket. By default, versioning is disabled. Regardless of whether you have enabled versioning, each object in your bucket has a version ID. If you have not enabled versioning, Amazon S3 sets the value of the version ID to null. If you have enabled versioning, Amazon S3 assigns a unique version ID value for the object. When you enable versioning on a bucket, objects already stored in the bucket are unchanged. The version IDs (null), contents, and permissions remain the same.

Enabling and suspending versioning is done at the bucket level. When you enable versioning for a bucket, all objects added to it will have a unique version ID. Unique version IDs are randomly generated, Unicode, UTF-8 encoded, URL-ready, opaque strings that are at most 1024 bytes long. An example version ID is 3/L4kqtJlcpXroDTDmJ+rmSpXd3dIbrHY+MTRCxf3vjVBH40Nr8X8gdRQBpUMLUo. Only Amazon S3 generates version IDs. They cannot be edited.

0
votes

There is no direct way to do so, but you can do it with the AWS CopyObject API. Reference: https://docs.aws.amazon.com/cli/latest/reference/s3api/copy-object.html

Iterate over every object version, capture its version ID, and copy that version to the destination bucket.
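A minimal sketch of that loop, assuming a boto3 S3 client is passed in as `client` and versioning is already enabled on the destination bucket. The function name `copy_all_versions` is made up for illustration. Note that the copies still receive new version IDs from S3, so the function returns a mapping from old to new IDs rather than preserving them. Delete markers are skipped, and `copy_object` only handles objects up to 5 GB.

```python
from collections import defaultdict


def copy_all_versions(client, source_bucket, dest_bucket):
    """Copy every object version, oldest first; return {(key, old_id): new_id}."""
    versions_by_key = defaultdict(list)
    paginator = client.get_paginator("list_object_versions")
    for page in paginator.paginate(Bucket=source_bucket):
        # Delete markers (page["DeleteMarkers"]) are ignored in this sketch.
        for version in page.get("Versions", []):
            versions_by_key[version["Key"]].append(version["VersionId"])

    alias = {}
    for key, version_ids in versions_by_key.items():
        # S3 lists the newest version of a key first, so reverse the list
        # to recreate the history in its original order.
        for old_id in reversed(version_ids):
            response = client.copy_object(
                Bucket=dest_bucket,
                Key=key,
                CopySource={
                    "Bucket": source_bucket,
                    "Key": key,
                    "VersionId": old_id,
                },
            )
            # VersionId is missing from the response if versioning is
            # not enabled on the destination bucket.
            alias[(key, old_id)] = response.get("VersionId")
    return alias
```

With real credentials configured, this would be called as `copy_all_versions(boto3.client("s3"), "<source>", "<dest>")`.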

Hope this helps