1
votes

I'm trying to upload large files (4 GB) to Azure Blob Storage, but the upload fails. Following this article (https://docs.microsoft.com/en-us/azure/storage/storage-dotnet-how-to-use-blobs), I wrote this code:

CloudBlobContainer blobContainer = blobClient.GetContainerReference("my-container-name");
await blobContainer.CreateIfNotExistsAsync();
CloudBlockBlob blockBlob = blobContainer.GetBlockBlobReference("blob-name");
// Verbatim string: in a regular literal, "\t" would be read as a tab character
await blockBlob.UploadFromFileAsync(@"C:\test.avi");

But I get this error:

Message: Stream was too long.
Source: System.Private.CoreLib
StackTrace: at System.IO.MemoryStream.Write(Byte[] buffer, Int32 offset, Int32 count) at Microsoft.WindowsAzure.Storage.Blob.BlobWriteStream.d__5.MoveNext() in C:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\WindowsRuntime\Blob\BlobWriteStream.cs:line 144 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at Microsoft.WindowsAzure.Storage.Core.Util.StreamExtensions.d__1`1.MoveNext() in C:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\Common\Core\Util\StreamExtensions.cs:line 308 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at Microsoft.WindowsAzure.Storage.Blob.CloudBlockBlob.<>c__DisplayClass20_0.<b__0>d.MoveNext() in C:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\WindowsRuntime\Blob\CloudBlockBlob.cs:line 301 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at Microsoft.WindowsAzure.Storage.Blob.CloudBlockBlob.<>c__DisplayClass23_0.<b__0>d.MoveNext() in C:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\WindowsRuntime\Blob\CloudBlockBlob.cs:line 397 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
at MyCompany.AzureServices.Blob.BlobService.d__7.MoveNext() in C:\MyProjectSource\MyCompany.AzureServices\Blob\BlobService.cs:line 79 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
at MyCompany.AzureServices.Blob.MyProject.RecordBlobService.<>c__DisplayClass1_0.<b__0>d.MoveNext() in C:\MyProjectSource\MyCompany.AzureServices\Blob\MyProject\RecordBlobService.cs:line 25

Following this article (https://www.simple-talk.com/cloud/platform-as-a-service/azure-blob-storage-part-4-uploading-large-blobs/), I tried adding more options for large files. This is my new code:

TimeSpan backOffPeriod = TimeSpan.FromSeconds(2);
int retryCount = 1;
BlobRequestOptions bro = new BlobRequestOptions()
{
    // If the file to upload is larger than 64 MB, send it in multiple parts
    SingleBlobUploadThresholdInBytes = 67108864, // 64 MB (maximum)
    // Number of threads used to send data
    ParallelOperationThreadCount = 1,
    // If a block fails, retry once (retryCount) after 2 seconds (backOffPeriod)
    RetryPolicy = new ExponentialRetry(backOffPeriod, retryCount),
};
blobClient.DefaultRequestOptions = bro;
CloudBlockBlob blockBlob = blobContainer.GetBlockBlobReference("blob-name");
// If the file is sent in multiple parts, each part is 4 MB
blockBlob.StreamWriteSizeInBytes = 4194304; // 4 MB (maximum)
await blockBlob.UploadFromFileAsync(@"C:\test.avi");

But I got the same error again (Stream was too long).

I found in the Microsoft.WindowsAzure.Storage library that the function "UploadFromFileAsync" uses "UploadFromStreamAsync", which uses a MemoryStream. I think my error comes from that MemoryStream, but the Blob Storage article says that the maximum size of a blob is 195 GB. So how am I supposed to use it?

I use Microsoft.WindowsAzure.Storage version 7.2.1

Thanks !

UPDATE 1: Thanks to Tom Sun and Zhaoxing Lu, I tried to use Microsoft.Azure.Storage.DataMovement.
Sadly, I get an error from the "TransferManager.UploadAsync" function. I tried to google it but found nothing.
Any ideas?

This is my code:

    string storageConnectionString = "myStorageConnectionString";
    string filePath = @"C:\LargeFile.avi";
    string blobName = "large_file.avi";

    CloudStorageAccount account = CloudStorageAccount.Parse(storageConnectionString);
    CloudBlobClient blobClient = account.CreateCloudBlobClient();
    CloudBlobContainer blobContainer = blobClient.GetContainerReference("mycontainer");
    blobContainer.CreateIfNotExists();

    CloudBlockBlob destBlob = blobContainer.GetBlockBlobReference(blobName);
    // Setup the number of the concurrent operations
    TransferManager.Configurations.ParallelOperations = 64;

    // Setup the transfer context and track the upload progress
    var context = new SingleTransferContext();
    UploadOptions uploadOptions = new UploadOptions
    {
        DestinationAccessCondition = AccessCondition.GenerateIfExistsCondition()
    };
    context.ProgressHandler = new Progress<TransferStatus>(progress =>
    {
        Console.WriteLine("Bytes uploaded: {0}", progress.BytesTransferred);
    });

    // Upload a local blob
    TransferManager.UploadAsync(filePath, destBlob, uploadOptions, context, CancellationToken.None).Wait();

This is the error :
Message : One or more errors occurred. (The transfer failed: The format of value '*' is invalid..)
Source : System.Private.CoreLib
StackTrace :

at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions) at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken) at System.Threading.Tasks.Task.Wait() at MyCompany.AzureServices.Blob.BlobService.d__7.MoveNext() in C:\MyProjectSource\MyCompany.AzureServices\Blob\BlobService.cs:line 96 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
at MyCompany.AzureServices.Blob.MyProject.RecordBlobService.<>c__DisplayClass1_0.<b__0>d.MoveNext() in C:\MyProjectSource\MyCompany.AzureServices\Blob\MyProject\RecordBlobService.cs:line 25

And the inner exception :
Message : The transfer failed: The format of value '*' is invalid..
Source : Microsoft.WindowsAzure.Storage.DataMovement
StackTrace :

at Microsoft.WindowsAzure.Storage.DataMovement.TransferScheduler.d__22.MoveNext() in C:\Local\Jenkins\jobs\DM_Hotfix\workspace\lib\TransferScheduler.cs:line 214 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at Microsoft.WindowsAzure.Storage.DataMovement.SingleObjectTransfer.d__7.MoveNext() in C:\Local\Jenkins\jobs\DM_Hotfix\workspace\lib\TransferJobs\SingleObjectTransfer.cs:line 226 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at Microsoft.WindowsAzure.Storage.DataMovement.TransferManager.d__72.MoveNext() in C:\Local\Jenkins\jobs\DM_Hotfix\workspace\lib\TransferManager.cs:line 1263

And the next inner exception :
Message : The format of value '*' is invalid.
Source : Microsoft.WindowsAzure.Storage.DataMovement
StackTrace :

at Microsoft.WindowsAzure.Storage.DataMovement.TransferControllers.BlockBlobWriter.HandleFetchAttributesResult(Exception e) in C:\Local\Jenkins\jobs\DM_Hotfix\workspace\lib\TransferControllers\TransferWriters\BlockBlobWriter.cs:line 196 at Microsoft.WindowsAzure.Storage.DataMovement.TransferControllers.BlockBlobWriter.d__18.MoveNext() in C:\Local\Jenkins\jobs\DM_Hotfix\workspace\lib\TransferControllers\TransferWriters\BlockBlobWriter.cs:line 157 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at Microsoft.WindowsAzure.Storage.DataMovement.TransferControllers.BlockBlobWriter.d__16.MoveNext() in C:\Local\Jenkins\jobs\DM_Hotfix\workspace\lib\TransferControllers\TransferWriters\BlockBlobWriter.cs:line 83 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at Microsoft.WindowsAzure.Storage.DataMovement.TransferControllers.SyncTransferController.d__13.MoveNext() in C:\Local\Jenkins\jobs\DM_Hotfix\workspace\lib\TransferControllers\SyncTransferController.cs:line 81 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at Microsoft.WindowsAzure.Storage.DataMovement.TransferControllers.TransferControllerBase.d__33.MoveNext() in C:\Local\Jenkins\jobs\DM_Hotfix\workspace\lib\TransferControllers\TransferControllerBase.cs:line 178 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at 
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at Microsoft.WindowsAzure.Storage.DataMovement.TransferScheduler.d__22.MoveNext() in C:\Local\Jenkins\jobs\DM_Hotfix\workspace\lib\TransferScheduler.cs:line 208

And the last inner exception :
Message : The format of value '*' is invalid.
Source : System.Net.Http
StackTrace :

at System.Net.Http.Headers.HttpHeaderParser.ParseValue(String value, Object storeValue, Int32& index) at System.Net.Http.Headers.EntityTagHeaderValue.Parse(String input) at Microsoft.WindowsAzure.Storage.Shared.Protocol.RequestMessageExtensions.ApplyAccessCondition(StorageRequestMessage request, AccessCondition accessCondition) in C:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\WindowsRuntime\Shared\Protocol\RequestMessageExtensions.cs:line 125 at Microsoft.WindowsAzure.Storage.Blob.CloudBlob.<>c__DisplayClass116_0.b__0(RESTCommand1 cmd, Uri uri, UriQueryBuilder builder, HttpContent cnt, Nullable1 serverTimeout, OperationContext ctx) in C:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\WindowsRuntime\Blob\CloudBlob.cs:line 1206 at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.d__4`1.MoveNext() in C:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\WindowsRuntime\Core\Executor\Executor.cs:line 91

And finally my project.json :

{
  "version": "1.0.0-*",

  "dependencies": {
    "Microsoft.Azure.DocumentDB.Core": "0.1.0-preview",
    "Microsoft.Azure.Storage.DataMovement": "0.4.1",
    "Microsoft.IdentityModel.Protocols": "2.0.0",
    "NETStandard.Library": "1.6.1",
    "MyProject.Data.Entities": "1.0.0-*",
    "MyProject.Settings": "1.0.0-*",
    "WindowsAzure.Storage": "7.2.1"
  },

  "frameworks": {   
    "netcoreapp1.0": {
      "imports": [
        "dnxcore50",
        "portable-net451+win8"
      ],
      "dependencies": {
        "Microsoft.NETCore.App": {
          "type": "platform",
          "version": "1.0.0-*"
        }
      }
    }
  }
}

Thanks for your help!

UPDATE 2 (working)

Thanks to Tom Sun, this is the working code:

    string storageConnectionString = "myStorageConnectionString";
    CloudStorageAccount account = CloudStorageAccount.Parse(storageConnectionString);
    CloudBlobClient blobClient = account.CreateCloudBlobClient();
    CloudBlobContainer blobContainer = blobClient.GetContainerReference("mycontainer");
    blobContainer.CreateIfNotExistsAsync().Wait();
    string sourcePath = @"C:\Tom\TestLargeFile.zip";
    CloudBlockBlob destBlob = blobContainer.GetBlockBlobReference("LargeFile.zip");
    // Set the number of concurrent operations
    TransferManager.Configurations.ParallelOperations = 64;
    // Set up the transfer context and track the upload progress
    var context = new SingleTransferContext
    {
        ProgressHandler = new Progress<TransferStatus>(
            progress => { Console.WriteLine("Bytes uploaded: {0}", progress.BytesTransferred); })
    };
    // Upload a local file
    TransferManager.UploadAsync(sourcePath, destBlob, null, context, CancellationToken.None).Wait();
    Console.WriteLine("Upload finished!");
    Console.ReadKey();

I also added

    ShouldOverwriteCallback = (source, destination) =>
    {
        return true;
    },

to the SingleTransferContext initializer, so that a blob is overwritten if it already exists.
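
For readers who want to see the two pieces together, here is a minimal sketch that merges the working code and the overwrite callback into one awaitable method. The class name LargeBlobUploader and its parameter names are hypothetical and only for illustration. The library calls are the ones already used in this question, so the sketch assumes the same package versions (Microsoft.Azure.Storage.DataMovement 0.4.1 with WindowsAzure.Storage 7.2.1); other versions may expose different members.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;
using Microsoft.WindowsAzure.Storage.DataMovement;

// Hypothetical helper class, not part of any library.
public static class LargeBlobUploader
{
    public static async Task UploadAsync(
        string connectionString, string containerName, string sourcePath, string blobName)
    {
        CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
        CloudBlobClient blobClient = account.CreateCloudBlobClient();
        CloudBlobContainer container = blobClient.GetContainerReference(containerName);
        await container.CreateIfNotExistsAsync();

        CloudBlockBlob destBlob = container.GetBlockBlobReference(blobName);

        // Number of concurrent operations, same value as in the question
        TransferManager.Configurations.ParallelOperations = 64;

        var context = new SingleTransferContext
        {
            // Report progress while the transfer runs
            ProgressHandler = new Progress<TransferStatus>(
                progress => Console.WriteLine("Bytes uploaded: {0}", progress.BytesTransferred)),
            // Always overwrite the destination blob if it already exists
            ShouldOverwriteCallback = (source, destination) => true
        };

        // Pass null for the upload options: no access condition is sent
        await TransferManager.UploadAsync(
            sourcePath, destBlob, null, context, CancellationToken.None);
    }
}
```

Awaiting the task instead of calling .Wait() also means a failure surfaces as the original exception rather than wrapped in an AggregateException ("One or more errors occurred").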

3 Answers

3
votes

The Azure Storage Data Movement Library makes it easy to upload large files to Azure Blob Storage. It works correctly for me. For more information, refer to the library's documentation. Please try the following code:

    string storageConnectionString = "storage connection string";
    CloudStorageAccount account = CloudStorageAccount.Parse(storageConnectionString);
    CloudBlobClient blobClient = account.CreateCloudBlobClient();
    CloudBlobContainer blobContainer = blobClient.GetContainerReference("mycontainer");
    blobContainer.CreateIfNotExists();
    string sourcePath = @"C:\Tom\TestLargeFile.zip";
    CloudBlockBlob destBlob = blobContainer.GetBlockBlobReference("LargeFile.zip");
    // Setup the number of the concurrent operations
    TransferManager.Configurations.ParallelOperations = 64;
    // Set up the transfer context and track the upload progress

    var context = new SingleTransferContext();

    UploadOptions uploadOptions = new UploadOptions
    {
        DestinationAccessCondition = AccessCondition.GenerateIfExistsCondition()
    };
    context.ProgressHandler = new Progress<TransferStatus>(progress =>
    {
        Console.WriteLine("Bytes uploaded: {0}", progress.BytesTransferred);
    });
    // Upload a local blob
    TransferManager.UploadAsync(sourcePath, destBlob, uploadOptions, context, CancellationToken.None).Wait();

For the SDK versions, refer to my packages.config file:

<?xml version="1.0" encoding="utf-8"?>
<packages>
  <package id="Microsoft.Azure.KeyVault.Core" version="1.0.0" targetFramework="net452" />
  <package id="Microsoft.Azure.Storage.DataMovement" version="0.4.1" targetFramework="net452" />
  <package id="Microsoft.Data.Edm" version="5.6.4" targetFramework="net452" />
  <package id="Microsoft.Data.OData" version="5.6.4" targetFramework="net452" />
  <package id="Microsoft.Data.Services.Client" version="5.6.4" targetFramework="net452" />
  <package id="Microsoft.WindowsAzure.ConfigurationManager" version="1.8.0.0" targetFramework="net452" />
  <package id="Newtonsoft.Json" version="6.0.8" targetFramework="net452" />
  <package id="System.Spatial" version="5.6.4" targetFramework="net452" />
  <package id="WindowsAzure.Storage" version="7.2.1" targetFramework="net452" />
</packages>

You can then check the uploaded file in the Azure portal.

Update:

For a .NET Core project, use this code:

    string storageConnectionString = "myStorageConnectionString";
    CloudStorageAccount account = CloudStorageAccount.Parse(storageConnectionString);
    CloudBlobClient blobClient = account.CreateCloudBlobClient();
    CloudBlobContainer blobContainer = blobClient.GetContainerReference("mycontainer");
    blobContainer.CreateIfNotExistsAsync().Wait();
    string sourcePath = @"C:\Tom\TestLargeFile.zip";
    CloudBlockBlob destBlob = blobContainer.GetBlockBlobReference("LargeFile.zip");
    // Set the number of concurrent operations
    TransferManager.Configurations.ParallelOperations = 64;
    // Set up the transfer context and track the upload progress
    var context = new SingleTransferContext
    {
        ProgressHandler = new Progress<TransferStatus>(
            progress => { Console.WriteLine("Bytes uploaded: {0}", progress.BytesTransferred); })
    };
    // Upload a local file
    TransferManager.UploadAsync(sourcePath, destBlob, null, context, CancellationToken.None).Wait();
    Console.WriteLine("Upload finished!");
    Console.ReadKey();

1
votes

We're actively looking into the issue in the Azure Storage Client Library.

Please note that UploadFromFileAsync() is not a reliable or efficient operation for a huge blob, so I'd suggest considering the following alternatives:

If a command-line tool is acceptable, try AzCopy, which transfers Azure Storage data with high performance and whose transfers can be paused and resumed.

If you want to control the transfer jobs programmatically, use the Azure Storage Data Movement Library, which is the core of AzCopy.
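
As background on the numbers in the question, the following arithmetic-only sketch may help. It rests on two assumptions that are not stated in this thread: that a MemoryStream cannot hold more than int.MaxValue bytes, and that a block blob is limited to 50,000 blocks, which with the 4 MB block size quoted in the question would explain the 195 GB figure. The stack trace in the question suggests, but does not prove, that the file content was being written into a MemoryStream. The class name UploadLimitsSketch is hypothetical.

```csharp
using System;
using System.Globalization;

// Hypothetical class name; the sketch only does arithmetic and calls no Azure API.
public static class UploadLimitsSketch
{
    public static void Main()
    {
        // Assumption: MemoryStream tracks its length with an Int32,
        // so it cannot hold more than int.MaxValue bytes (about 2 GB).
        long memoryStreamLimit = int.MaxValue;

        // The 4 GB file from the question
        long fileSize = 4L * 1024 * 1024 * 1024;
        Console.WriteLine("File fits in a MemoryStream: {0}", fileSize <= memoryStreamLimit);

        // Assumption: 4 MB blocks (the maximum quoted in the question)
        // and at most 50,000 blocks per block blob.
        long blockSize = 4L * 1024 * 1024;
        long maxBlocks = 50000;
        double maxBlobGb = (double)(blockSize * maxBlocks) / (1024L * 1024 * 1024);
        Console.WriteLine("Largest block blob: {0} GB",
            maxBlobGb.ToString("F1", CultureInfo.InvariantCulture));
    }
}
```

Under these assumptions the program prints "File fits in a MemoryStream: False" and "Largest block blob: 195.3 GB". A 4 GB file is therefore well within the blob size limit, but far too large to buffer in a single MemoryStream.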

0
votes

The original issue is fixed in version 8.0 of WindowsAzure.Storage.