I'm running into a limitation with an Azure Data Lake Storage Gen2 cmdlet:

https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-directory-file-acl-powershell

I'm using this cmdlet:

Get-AzDataLakeGen2ChildItem -Context $ctx -FileSystem $filesystemName -Path $dirname -Recurse -FetchProperty

to get the ACLs of all files and folders from the root, but it returns at most 5000 objects. When I run it against a folder with more than 5000 objects, it shows a message with a continuation token.

Basically, with that token I can continue from the last item extracted, but doing this manually is not practical because we may have millions of files in the data lake.

Is it possible to avoid this limit, or to loop in some way?

Here is the script that I'm using (it works fine; I don't report all files, only folders from the root):

    $dir = Get-AzDataLakeGen2ChildItem -Context $ctx -FileSystem "datalake" -Recurse -FetchProperty

    $FileOutdtk = "C:\Temp\file.csv"
    Clear-Content $FileOutdtk

    # Header row; fields are separated by "^"
    Add-Content $FileOutdtk ('"Path"^"IsDirectory"^"Owner"^"DisplayName Owner"^"Owner Permissions"^"Group"^"DefaultScope"^"AccessControlType"^"EntityId"^"DisplayName Gruppo"^"PermissionsACL"')

    foreach ($directory in $dir) {
        # Report only directories whose owner matches "superuser"
        if ($directory.IsDirectory -eq $true) {
            if ($directory.Owner -imatch "superuser") {
                foreach ($ACLs in $directory.ACL) {
                    if ($null -eq $ACLs.EntityId) {
                        # Entry has no entity ID, so there is no group name to look up
                        Add-Content $FileOutdtk ('"' + $directory.Path + '^' + $directory.IsDirectory + '^' + $directory.Owner + '^' + "" + '^' + $directory.Permissions.Owner + '^' + $directory.Group + '^' + $ACLs.DefaultScope + '^' + $ACLs.accesscontroltype + '^' + $ACLs.EntityId + '^' + "" + '^' + $ACLs.Permissions + '"')
                    }
                    else {
                        # Resolve the display name of the Azure AD group behind the entity ID
                        $GruppiEntityId = Get-AzureADGroup -ObjectId $ACLs.EntityId
                        Add-Content $FileOutdtk ('"' + $directory.Path + '^' + $directory.IsDirectory + '^' + $directory.Owner + '^' + "" + '^' + $directory.Permissions.Owner + '^' + $directory.Group + '^' + $ACLs.DefaultScope + '^' + $ACLs.accesscontroltype + '^' + $ACLs.EntityId + '^' + $GruppiEntityId.displayname + '^' + $ACLs.Permissions + '"')
                    }
                }
            }
        }
    }

How can I loop that cmdlet to get more than 5000 objects?

Thanks a lot

1 Answer


If you want to list all items in an Azure Data Lake Storage Gen2 folder, call the cmdlet in a loop and pass the ContinuationToken of the last returned item to the next call, as in the following script:

    $storageAccount = Get-AzStorageAccount -ResourceGroupName "<>" -AccountName "<>"
    $ctx = $storageAccount.Context

    $fileSystem = "test"
    $dirName = "testFolder"
    $Token = $Null
    $Max = 2000   # maximum number of items returned per call
    do {
        # Fetch the next page, resuming from the previous continuation token
        $items = Get-AzDataLakeGen2ChildItem -Context $ctx -FileSystem $fileSystem -Path $dirName -Recurse -FetchProperty -ContinuationToken $Token -MaxCount $Max
        $items   # emit the current page
        if ($items.Length -le 0) { break }
        # The last item of each page carries the token for the next page
        $Token = $items[$items.Count - 1].ContinuationToken
    }
    while ($null -ne $Token)
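
To connect the paging loop with the export from the question, here is a minimal sketch rather than a tested implementation. `Get-AllPages` and `Export-DataLakeDirectoryAcl` are hypothetical helper names, not part of Az.Storage; the property names (`IsDirectory`, `Owner`, `Permissions.Owner`, `ACL`, `EntityId`, and so on) are taken from the question's script. It assumes the last item of each page carries the continuation token, as in the loop above, and it leaves out the `Get-AzureADGroup` display-name lookup to stay short.

```powershell
# Hypothetical helper (not part of Az.Storage): calls $FetchPage repeatedly,
# handing it the previous continuation token, and streams every item downstream.
function Get-AllPages {
    param(
        [Parameter(Mandatory)]
        [scriptblock] $FetchPage   # receives one argument: the token ($null on the first call)
    )
    $token = $null
    do {
        # @() guarantees an array even when a page holds zero or one item
        $page = @(& $FetchPage $token)
        if ($page.Count -eq 0) { break }
        $page                                   # emit this page's items
        $token = $page[-1].ContinuationToken    # assumption: the last item carries the next token
    } while (-not [string]::IsNullOrEmpty($token))
}

# Hypothetical wrapper: pages through a file system and writes one CSV row per
# ACL entry of every directory whose owner matches "superuser", as in the question.
function Export-DataLakeDirectoryAcl {
    param(
        [Parameter(Mandatory)] $Context,
        [Parameter(Mandatory)] [string] $FileSystem,
        [Parameter(Mandatory)] [string] $OutFile,
        [int] $PageSize = 2000
    )
    # GetNewClosure() captures $Context, $FileSystem and $PageSize in the script block
    $fetch = {
        param($token)
        Get-AzDataLakeGen2ChildItem -Context $Context -FileSystem $FileSystem -Recurse -FetchProperty `
            -MaxCount $PageSize -ContinuationToken $token
    }.GetNewClosure()

    Get-AllPages -FetchPage $fetch |
        Where-Object { $_.IsDirectory -and $_.Owner -imatch 'superuser' } |
        ForEach-Object {
            $item = $_
            foreach ($acl in $item.ACL) {
                [pscustomobject]@{
                    Path              = $item.Path
                    IsDirectory       = $item.IsDirectory
                    Owner             = $item.Owner
                    OwnerPermissions  = $item.Permissions.Owner
                    Group             = $item.Group
                    DefaultScope      = $acl.DefaultScope
                    AccessControlType = $acl.AccessControlType
                    EntityId          = $acl.EntityId
                    Permissions       = $acl.Permissions
                }
            }
        } |
        Export-Csv -Path $OutFile -Delimiter '^' -NoTypeInformation
}
```

With `$ctx` defined as above, a call would look like `Export-DataLakeDirectoryAcl -Context $ctx -FileSystem "datalake" -OutFile "C:\Temp\file.csv"`. Because every stage is a pipeline, rows are written as each page arrives instead of holding millions of items in memory.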