
I want to transfer a huge file (4.5 GB) from S3 to Azure Blob Storage as a single file. Since it is so large, we are breaking the S3 file into chunks of 5 MB each. Each chunk is uploaded to Azure Blob Storage, and at the end I want to reassemble all the chunks into a single blob in Azure, or alternatively have each chunk appended to the existing blob as it is uploaded.

Is there a solution for this?

Why not just use AzCopy to do all the work for you? docs.microsoft.com/en-us/azure/storage/common/… – silent

Firstly, thanks for this suggestion; this approach works using the Azure CLI. However, we would prefer not to share the Amazon access keys. We want to do a multipart download from Amazon S3 as a stream and send it on to Azure Blob Storage stream by stream. Is there a way to upload multiple chunks to a single blob in Azure Storage? – Sriram

1 Answer


Azure Storage supports chunked upload through Block Blobs. You can "stage" chunks of data and then, once all chunks have been uploaded, "commit" the staged chunks into a single blob.

The new azure-storage-blob Java SDK provides BlockBlobClient (and BlockBlobAsyncClient), which expose APIs to stage and commit blocks.

Use the SpecializedBlobClientBuilder to create an instance of BlockBlobClient.

Here's a sample:

import com.azure.storage.blob.models.BlockBlobItem;
import com.azure.storage.blob.specialized.BlockBlobClient;
import com.azure.storage.blob.specialized.SpecializedBlobClientBuilder;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.Base64;

BlockBlobClient blockBlobClient = new SpecializedBlobClientBuilder()
        .connectionString("<your-connection-string>")
        .containerName("<your-container-name>")
        .blobName("<your-blob-name>")
        .buildBlockBlobClient();

// Block IDs must be Base64-encoded, and all block IDs for a blob must have the same length
String chunkId1 = Base64.getEncoder().encodeToString("1".getBytes());
String chunkId2 = Base64.getEncoder().encodeToString("2".getBytes());
String chunkId3 = Base64.getEncoder().encodeToString("3".getBytes());

byte[] chunk1Bytes = " chunk 1.".getBytes();
byte[] chunk2Bytes = " chunk 2.".getBytes();
byte[] chunk3Bytes = " chunk 3.".getBytes();

ByteArrayInputStream chunk1 = new ByteArrayInputStream(chunk1Bytes);
ByteArrayInputStream chunk2 = new ByteArrayInputStream(chunk2Bytes);
ByteArrayInputStream chunk3 = new ByteArrayInputStream(chunk3Bytes);

// Stage 3 blocks
blockBlobClient.stageBlock(chunkId1, chunk1, chunk1Bytes.length);
blockBlobClient.stageBlock(chunkId2, chunk2, chunk2Bytes.length);
blockBlobClient.stageBlock(chunkId3, chunk3, chunk3Bytes.length);

// Commit all 3 blocks - the order of the chunkIds determines the order in the final blob
BlockBlobItem blockBlobItem = blockBlobClient.commitBlockList(Arrays.asList(chunkId1, chunkId2, chunkId3));


// Download the committed blob to verify the chunks were assembled in order
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
blockBlobClient.download(outputStream);

System.out.println(new String(outputStream.toByteArray())); // prints " chunk 1. chunk 2. chunk 3."
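For your actual scenario (a 4.5 GB object in S3, 5 MB chunks), you can combine the same stage/commit pattern with ranged GETs from S3, so each slice is streamed straight into stageBlock without buffering the whole file in memory. Below is a rough sketch, not production code: it assumes the AWS SDK for Java v2 (software.amazon.awssdk:s3); the helper method name, the bucket/key parameters, and the reuse of the blockBlobClient from the snippet above are all placeholders you would adapt.

import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;
import software.amazon.awssdk.services.s3.model.HeadObjectRequest;

import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

void copyFromS3ToAzure(S3Client s3, BlockBlobClient blockBlobClient,
                       String bucket, String key) throws IOException {
    long chunkSize = 5L * 1024 * 1024; // 5 MB blocks, as in the question

    // Total object size, needed to compute the byte ranges
    long totalSize = s3.headObject(
            HeadObjectRequest.builder().bucket(bucket).key(key).build())
            .contentLength();

    List<String> blockIds = new ArrayList<>();
    long position = 0;
    int blockNumber = 0;

    while (position < totalSize) {
        long end = Math.min(position + chunkSize, totalSize) - 1;
        long length = end - position + 1;

        // All block IDs for a blob must be Base64-encoded and the same length
        String blockId = Base64.getEncoder()
                .encodeToString(String.format("%06d", blockNumber).getBytes());

        // Ranged GET: pull only this 5 MB slice from S3...
        GetObjectRequest rangeRequest = GetObjectRequest.builder()
                .bucket(bucket).key(key)
                .range("bytes=" + position + "-" + end)
                .build();

        // ...and stage it as a block, streaming it straight into Azure
        try (InputStream chunk = s3.getObject(rangeRequest)) {
            blockBlobClient.stageBlock(blockId, chunk, length);
        }

        blockIds.add(blockId);
        position += length;
        blockNumber++;
    }

    // One commit at the end assembles all staged blocks into a single blob
    blockBlobClient.commitBlockList(blockIds);
}

A 4.5 GB file at 5 MB per block comes to roughly 900 blocks, well under the 50,000-block limit per block blob, and you can raise the block size if you want fewer round trips.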