
I am trying to upload a very large file to Azure Blob storage, using Java with the Azure SDK. I want to do this in the following manner:

  1. Split the large file into smaller chunks.
  2. Upload all the chunks in parallel to Azure Blob storage.

I could write the code myself to split the file and upload the chunks on multiple threads, but I am looking for something provided by the SDK itself. The examples and samples I found were all doing these steps explicitly. As of now, I am uploading with the following code:

blob.upload(stream, stream.length());

Can someone please point me in the right direction?

UPDATE -

After input from @Gaurav Mantri, my updated code looks like this:

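// Per-request options for this upload.
BlobRequestOptions blobRequestOptions = new BlobRequestOptions();
// Allow up to 8 simultaneous requests for this operation.
blobRequestOptions.setConcurrentRequestCount(8);
// Blobs larger than this many bytes (about 65 MB) are uploaded in chunks.
blobRequestOptions.setSingleBlobPutThresholdInBytes(65000000);
blob.upload(stream, stream.length(), null, blobRequestOptions, null);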

I am uploading a 3 GB file. I am still not sure whether the file is actually being split, because the upload time has not improved.

The SDK copy is synchronous. If you are copying between two Azure storage accounts, you would get better performance through AzCopy. Call the utility from Java to achieve the desired result. - harishr
@harishr I am uploading a very large file from my local Windows machine to Azure Blob storage. Can I use AzCopy in this case as well? - abi_pat
AzCopy can certainly be used for this. That is exactly what the utility is meant for. - Gaurav Mantri

1 Answer


You should take a look at the BlobRequestOptions class in the Java SDK. Two of its methods are of interest to you:

setConcurrentRequestCount: Sets the concurrent number of simultaneous requests per operation. The default concurrent request count is set in the client and is by default 1, indicating no concurrency.

setSingleBlobPutThresholdInBytes: Sets the threshold size used for writing a single blob. The default threshold size is set in the client and is by default 32MB. You can change the threshold size on this request by setting this property.

With the threshold set, if the SDK finds that the blob you are trying to upload is larger than the specified value, it automatically breaks the blob into chunks and uploads those chunks. The concurrent request count then controls how many of those requests run at the same time.
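
To make the threshold-and-chunk behaviour concrete, here is a minimal plain-Java sketch of the general idea. It does not use or reproduce the Azure SDK: the class ChunkedUploadSketch, the BlockSink interface, and all method names are hypothetical stand-ins, and the block size is a parameter chosen by the caller rather than a value taken from the SDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; none of these names belong to the Azure SDK.
class ChunkedUploadSketch {

    /** Hypothetical stand-in for whatever sends one block to storage. */
    interface BlockSink {
        void putBlock(int index, byte[] data, int offset, int length) throws Exception;
    }

    /** Number of blocks needed for totalBytes at blockSize bytes each (ceiling division). */
    static long blockCount(long totalBytes, long blockSize) {
        return (totalBytes + blockSize - 1) / blockSize;
    }

    /** True when the payload is larger than the single-put threshold (assumed rule). */
    static boolean shouldSplit(long totalBytes, long singlePutThreshold) {
        return totalBytes > singlePutThreshold;
    }

    /** Splits data into blocks, sends them on `concurrency` threads, returns the block count. */
    static int uploadInBlocks(byte[] data, int blockSize, int concurrency, BlockSink sink)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        try {
            List<Future<?>> pending = new ArrayList<>();
            int index = 0;
            for (int offset = 0; offset < data.length; offset += blockSize) {
                final int i = index++;
                final int off = offset;
                final int len = Math.min(blockSize, data.length - offset);
                // Each block becomes one task; the pool bounds how many run at once.
                pending.add(pool.submit(() -> {
                    sink.putBlock(i, data, off, len);
                    return null;
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for completion and surface any upload failure
            }
            return index;
        } finally {
            pool.shutdown();
        }
    }
}
```

As a rough check on the numbers in the question: a 3 GB file is well above a 65,000,000-byte threshold, so shouldSplit returns true for it, and with an assumed block size of 4 MB, blockCount gives 768 blocks. Whether splitting shortens the upload still depends on the available upstream bandwidth, since parallel requests cannot exceed what the network link can carry.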