0
votes

My PowerShell script seems slow: when I run the code below in ISE, it keeps running and doesn't stop.

I am trying to write the list of subfolders in a folder (the folder path is in $scratchpath) to a text file. There are more than 30k subfolders.

$limit = (Get-Date).AddDays(-15)
$path = "E:\Data\PathToScratch.txt"
$scratchpath = Get-Content $path -TotalCount 1

Get-ChildItem -Path $scratchpath -Recurse -Force | Where-Object { $_.PSIsContainer -and $_.CreationTime -lt $limit } | Add-Content C:\Data\eProposal\POC\ScratchContents.txt

Let me know if my approach is not optimal. Ultimately, I will read the text file, zip the subfolders for archival and delete them.

Thanks for your help in advance. I am new to PowerShell and have only watched a few videos on MVA.

2
I think I would use background jobs (Start-Job) for all the top level folders (modulo a given maximum) and combine the results from that. - Micky Balladelli

2 Answers

2
votes

Add-Content, Set-Content, and even Out-File are notoriously slow in PowerShell. This is because each call opens the file, writes to it, and closes the handle. It never does anything more intelligently than that.

That doesn't sound bad until you consider how pipelines work with Get-ChildItem (and Where-Object and Select-Object). Get-ChildItem doesn't wait until it has finished before it begins passing objects into the pipeline. It starts passing objects as soon as the provider returns them. For a large result set, this means that objects are still being fed into the pipeline long after the first ones have finished processing. Generally speaking, this is great! It means the system will function more efficiently, and it's why stuff like this:

$x = Get-ChildItem;
$x | ForEach-Object { [...] };

Is significantly slower than stuff like this:

Get-ChildItem | ForEach-Object { [...] };

And it's why stuff like this appears to stall:

Get-ChildItem | Sort-Object Name | ForEach-Object { [...] };

The Sort-Object cmdlet needs to wait until it has received all pipeline objects before it sorts. It has to, in order to be able to sort at all. The sort itself is nearly instantaneous; it's just the cmdlet waiting until it has the full results.
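To make the streaming-versus-blocking difference visible, here is a small sketch that records the order in which objects are emitted and consumed. The variable names $streamLog and $sortLog are made up for this illustration, and the numbers 1..3 stand in for the objects Get-ChildItem would return.

```powershell
# Streaming: each object reaches the second stage as soon as the first emits it.
$streamLog = New-Object 'System.Collections.Generic.List[string]'
1..3 |
    ForEach-Object { $streamLog.Add("emit $_"); $_ } |
    ForEach-Object { $streamLog.Add("consume $_") }

# Blocking: Sort-Object collects every object before passing anything on.
$sortLog = New-Object 'System.Collections.Generic.List[string]'
1..3 |
    ForEach-Object { $sortLog.Add("emit $_"); $_ } |
    Sort-Object |
    ForEach-Object { $sortLog.Add("consume $_") }

$streamLog -join ', '   # emit 1, consume 1, emit 2, consume 2, emit 3, consume 3
$sortLog -join ', '     # emit 1, emit 2, emit 3, consume 1, consume 2, consume 3
```

In the first log the emit and consume entries interleave; in the second, nothing is consumed until everything has been emitted, which is the apparent stall described above.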

The issue with Add-Content is that, well, it experiences the pipeline not as, "Here's a giant string to write once," but instead as, "Here's a string to write. Here's a string to write. Here's a string to write. Here's a string to write." You'll be sending content to Add-Content here line by line. Each line will instantiate a new call to Add-Content, requiring the file to open, write, and close. You'll likely see better performance if you assign the result of Get-ChildItem [...] | Where-Object [...] to a variable, and then write the entire variable to the file at once:

$limit = (Get-Date).AddDays(-15);
$path = "E:\Data\PathToScratch.txt";
$scratchpath = Get-Content $path -TotalCount 1;

$Results = Get-ChildItem -Path $scratchpath -Recurse -Force -Directory |
    Where-Object { $_.CreationTime -lt $limit } |
    Select-Object -ExpandProperty FullName;

Add-Content C:\Data\eProposal\POC\ScratchContents.txt -Value $Results;

However, you might be concerned about memory usage if your results are actually going to be extremely large. You can actually use System.IO.StreamWriter for this purpose, too. My process improved in speed by nearly two orders of magnitude (from 12 hours to 20 minutes) by switching to StreamWriter and also only calling StreamWriter when I had about 250 lines to write (that seemed to be the break-even point for StreamWriter's overhead). But I was parsing all ACLs for user home and group shares for about 10,000 users and nearly 10 TB of data. Your task might not be as large.
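A minimal sketch of the StreamWriter approach, assuming PowerShell 3.0 or later (needed for -Directory). The function name Write-OldFolderList and its parameters are hypothetical, chosen only for this example; it is not the ACL script mentioned above, and it skips the 250-line batching and relies on StreamWriter's own internal buffer instead.

```powershell
# Hypothetical helper: streams matching folder paths to a file through one
# open handle instead of reopening the file for every line.
function Write-OldFolderList {
    param(
        [Parameter(Mandatory = $true)][string]$Root,
        # Use an absolute path: .NET resolves relative paths against the
        # process working directory, not the current PowerShell location.
        [Parameter(Mandatory = $true)][string]$OutFile,
        [Parameter(Mandatory = $true)][datetime]$Limit
    )

    # Second constructor argument $true means "append to an existing file".
    $writer = New-Object System.IO.StreamWriter -ArgumentList $OutFile, $true
    try {
        Get-ChildItem -Path $Root -Recurse -Force -Directory |
            Where-Object { $_.CreationTime -lt $Limit } |
            ForEach-Object { $writer.WriteLine($_.FullName) }
    }
    finally {
        # Flushes buffered lines and releases the file handle, even on error.
        $writer.Dispose()
    }
}
```

With the variables from the question, the call would look something like: Write-OldFolderList -Root $scratchpath -OutFile 'C:\Data\eProposal\POC\ScratchContents.txt' -Limit $limit. Memory use stays flat because nothing is collected into a variable first.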

Here's a good blog explaining the issue.

0
votes

Do you have at least PowerShell 3.0? If you do, you should be able to reduce the time by having Get-ChildItem return only directories, since at the moment it returns the files as well.

Get-ChildItem -Path $scratchpath -Recurse -Force -Directory | ...

Currently you are returning all files and folders and then filtering out the files with $_.PSIsContainer, which is slower. So you should end up with something like this:

Get-ChildItem -Path $scratchpath -Recurse -Force -Directory |
    Where-Object { $_.CreationTime -lt $limit } |
    Select-Object -ExpandProperty FullName |
    Add-Content C:\Data\eProposal\POC\ScratchContents.txt