0
votes

I'm working on an Azure solution in which Azure Stream Analytics (ASA) writes its output to blob storage. The output files land in a folder tree structured as yyyy/mm/dd/hh (e.g. 2017/10/26/07). Sometimes files are still being written to an hour folder after that hour has passed, and as a result the files can become very large. Is there a way to limit the size of those files from ASA?

2 Answers

0
votes

There is no way to limit the file size today; the only size limit is the blob's own limit. However, if your path pattern is yyyy/mm/dd/hh, ASA will create a new folder for every hour. Please note that the folder is chosen from the System.Timestamp column, not wall-clock time.
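
To illustrate why a file can keep growing after its hour has passed, here is a minimal sketch of how an hour folder follows from an event's timestamp rather than from the current time. The class and method names (HourFolder, For) are hypothetical; this is not ASA's implementation, only the same idea expressed in C#.

```csharp
using System;
using System.Globalization;

public static class HourFolder
{
    // Builds the yyyy/MM/dd/HH folder name from the event's own timestamp.
    // The current wall-clock time (DateTime.UtcNow) is deliberately not used.
    public static string For(DateTime eventTimestampUtc)
    {
        // InvariantCulture keeps "/" as a literal slash regardless of locale.
        return eventTimestampUtc.ToString("yyyy/MM/dd/HH", CultureInfo.InvariantCulture);
    }
}
```

An event stamped 2017-10-26 07:59:30 that is processed at 08:05 still maps to 2017/10/26/07, so late-arriving events are appended to the previous hour's folder.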

0
votes

Yes, you can limit the file size and create a new file once the existing file reaches the limit, by using the Length property shown below.

namespace Microsoft.Azure.Management.DataLake.Store.Models {
    ...
    // Summary:
    //     Gets the number of bytes in a file.
    [JsonProperty(PropertyName = "length")]
    public long? Length { get; }
    ...
}

Below is an example scenario:

Scenario: if the file size reaches 256 MB (268435456 bytes), create a new file; otherwise keep using the existing file.

Create a function that determines the file path to write to. Below is a sample code snippet for that function:

public static async Task<string> GetFilePath(DataLakeStoreClient client, string path) {
    var createNewFile = false;
    ......
    // Roll over to a new file once the current one reaches 256 MB (268435456 bytes).
    if (await client.GetFileSize(returnValue) >= 256 * 1024 * 1024)
    {
        returnValue = GetFilePath(path);
        createNewFile = true;
    }
    ......
}

// Returns the file's length in bytes, as reported by the file status.
public async Task<long?> GetFileSize(string filepath) {
    return (await this._client.FileSystem.GetFileStatusAsync(_connectionString.AccountName, filepath)).FileStatus.Length;
}
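
Since parts of the snippet above are elided, here is a self-contained sketch of the same rollover idea under stated assumptions. The names RollingFile and ChoosePath, the "_0", "_1", ... suffix scheme, and the convention that the size lookup returns null for a file that does not exist yet are all hypothetical and not part of any SDK. In the setting above, the lookup would be backed by something like GetFileSize.

```csharp
using System;

public static class RollingFile
{
    // 256 MB expressed in bytes (268435456).
    public const long MaxBytes = 256L * 1024 * 1024;

    // Returns the first candidate path that is either missing or still below the limit.
    // getSize is an assumed lookup: it returns null when the file does not exist yet.
    public static string ChoosePath(string basePath, Func<string, long?> getSize, long maxBytes = MaxBytes)
    {
        for (int index = 0; ; index++)
        {
            string candidate = basePath + "_" + index;
            long? size = getSize(candidate);
            if (size == null || size.Value < maxBytes)
            {
                return candidate;
            }
        }
    }
}
```

For example, if "out_0" already holds 268435456 bytes and "out_1" does not exist, ChoosePath("out", lookup) returns "out_1". Passing the lookup as a delegate keeps the rollover decision testable without a storage account.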