
I am looking for a way to find the total size of all data stored in a data lake (Azure Data Lake Storage Gen2, ADLS Gen2). Does anyone know how to get this information, i.e. how much data is stored? I tried to find an appropriate API but haven't found anything so far. I would be thankful for any tips.


1 Answer


If you want the size of all data stored in Data Lake Gen2 (not including File, Table, and Queue storage), you can use the Metrics - List REST API with metricnames=BlobCapacity. Set the timespan to the most recent hour: for example, if the current time is 2019-10-14T05:48:03Z, use timespan=2019-10-14T04:47:03Z/2019-10-14T05:47:03Z. It works fine on my side.

Sample:

GET https://management.azure.com/subscriptions/<subscription-id>/resourceGroups/<resource-group-name>/providers/Microsoft.Storage/storageAccounts/<storageaccount-name>/blobServices/default/providers/microsoft.insights/metrics?timespan=2019-10-14T04:47:03Z/2019-10-14T05:47:03Z&metricnames=BlobCapacity&api-version=2018-01-01
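If you would rather build this request in code than by hand, here is a minimal Python sketch that computes the one-hour timespan and assembles the same URL. The function names (last_hour_timespan, blob_capacity_url) are made up for illustration and are not part of any Azure SDK. The sketch only builds the URL; sending it still requires an Authorization: Bearer token for Azure Resource Manager, which is not shown here.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def last_hour_timespan(now):
    """Return an ISO 8601 'start/end' interval covering the hour before `now`.

    `now` should be a timezone-aware datetime; it is converted to UTC.
    """
    end = now.astimezone(timezone.utc).replace(microsecond=0)
    start = end - timedelta(hours=1)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return f"{start.strftime(fmt)}/{end.strftime(fmt)}"


def blob_capacity_url(subscription_id, resource_group, account, timespan):
    """Assemble the Metrics - List URL for the BlobCapacity metric (hypothetical helper)."""
    base = (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account}"
        "/blobServices/default/providers/microsoft.insights/metrics"
    )
    # Keep ':' and '/' unescaped so the timespan reads like the sample request.
    query = urlencode(
        {
            "timespan": timespan,
            "metricnames": "BlobCapacity",
            "api-version": "2018-01-01",
        },
        safe="/:",
    )
    return f"{base}?{query}"


if __name__ == "__main__":
    span = last_hour_timespan(datetime.now(timezone.utc))
    # Placeholder identifiers; substitute your own values.
    print(blob_capacity_url("<subscription-id>", "<resource-group-name>", "<storageaccount-name>", span))
```
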

Response:

{
   "cost":0,
   "timespan":"2019-10-14T04:47:03Z/2019-10-14T05:47:03Z",
   "interval":"PT1H",
   "value":[ 
      { 
         "id":"/subscriptions/xxxxxxx/resourceGroups/xxxxxxx/providers/Microsoft.Storage/storageAccounts/joygen2/blobServices/default/providers/Microsoft.Insights/metrics/BlobCapacity",
         "type":"Microsoft.Insights/metrics",
         "name":{ 
            "value":"BlobCapacity",
            "localizedValue":"Blob Capacity"
         },
         "displayDescription":"The amount of storage used by the storage account’s Blob service in bytes.",
         "unit":"Bytes",
         "timeseries":[ 
            { 
               "metadatavalues":[ 

               ],
               "data":[ 
                  { 
                     "timeStamp":"2019-10-14T04:47:00Z",
                     "average":44710.0
                  }
               ]
            }
         ]
      }
   ],
   "namespace":"Microsoft.Storage/storageAccounts/blobServices",
   "resourceregion":"eastus"
}
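
To pull the actual number out of a response like the one above, something along these lines should work. It is only a sketch: latest_average is a hypothetical helper name, and it assumes the response has already been parsed into a Python dict (for example with json.loads) and has the same shape as the sample.

```python
import json


def latest_average(metrics_response):
    """Return the most recent 'average' value (in bytes) from a parsed
    Metrics - List response, or None if no data point carries one."""
    for metric in metrics_response.get("value", []):
        for series in metric.get("timeseries", []):
            # Some data points may have no 'average' yet; skip those.
            points = [p for p in series.get("data", []) if "average" in p]
            if points:
                # Timestamps share one ISO 8601 UTC format, so string order is time order.
                return max(points, key=lambda p: p["timeStamp"])["average"]
    return None


if __name__ == "__main__":
    # Trimmed copy of the sample response shown above.
    sample = json.loads("""
    {
      "value": [
        {
          "name": {"value": "BlobCapacity"},
          "unit": "Bytes",
          "timeseries": [
            {"metadatavalues": [],
             "data": [{"timeStamp": "2019-10-14T04:47:00Z", "average": 44710.0}]}
          ]
        }
      ]
    }
    """)
    print(latest_average(sample))
```
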

Update:

If you want the size of all data, including File, Table, and Queue storage, use the UsedCapacity metric name instead and query the storage account itself rather than its blobServices/default sub-resource.

Sample:

GET https://management.azure.com/subscriptions/<subscription-id>/resourceGroups/<resource-group-name>/providers/Microsoft.Storage/storageAccounts/<storageaccount-name>/providers/microsoft.insights/metrics?timespan=2019-10-14T04:47:03Z/2019-10-14T05:47:03Z&metricnames=UsedCapacity&api-version=2018-01-01

Response:

{ 
   "cost":0,
   "timespan":"2019-10-14T04:47:03Z/2019-10-14T05:47:03Z",
   "interval":"PT1H",
   "value":[ 
      { 
         "id":"/subscriptions/xxxxx/resourceGroups/xxxxx/providers/Microsoft.Storage/storageAccounts/xxxxx/providers/Microsoft.Insights/metrics/UsedCapacity",
         "type":"Microsoft.Insights/metrics",
         "name":{ 
            "value":"UsedCapacity",
            "localizedValue":"Used capacity"
         },
         "displayDescription":"Account used capacity",
         "unit":"Bytes",
         "timeseries":[ 
            { 
               "metadatavalues":[ 

               ],
               "data":[ 
                  { 
                     "timeStamp":"2019-10-14T04:47:00Z",
                     "average":2559131.0
                  }
               ]
            }
         ]
      }
   ],
   "namespace":"Microsoft.Storage/storageAccounts",
   "resourceregion":"eastus"
}
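
Both metrics are reported in bytes, so you may want to convert the value into something easier to read. A small sketch follows; format_bytes is a hypothetical helper, and it assumes binary units (1 KiB = 1024 bytes).

```python
def format_bytes(num_bytes):
    """Format a byte count using binary units (KiB, MiB, ...), two decimals."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit that keeps the value under 1024,
        # or at the largest unit available.
        if value < 1024 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


if __name__ == "__main__":
    print(format_bytes(44710.0))    # BlobCapacity value from the first sample
    print(format_bytes(2559131.0))  # UsedCapacity value from the second sample
```
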