I am trying to recursively search a directory structure for Word documents and extract the hyperlinks from each one. When the script runs, the file names are listed but the Hyperlink column is empty:
processing 2 docs

File Name                 Hyperlink
---------                 ---------
C:\temp\doc1.docx
C:\temp\doc1.docx
C:\temp\folder\doc2.docx
C:\temp\folder\doc2.docx
Nothing I have tried seems to work. For the "Hyperlink" value I have tried:
- "Hyperlink" = $_Address
- "Hyperlink" = $thisDoc.Address
- "Hyperlink" = $thisDoc.Hyperlink.Address

This is the full script:
Clear-Host
$parentFolder = "C:\temp"

# Find every Word document under the parent folder
$ourDocs = Get-ChildItem -Recurse -LiteralPath $parentFolder -File -Include *.doc*
"processing {0} docs" -f $ourDocs.Count

# Start Word hidden
$word = New-Object -ComObject Word.Application
$word.Visible = $false
$word.ScreenUpdating = $false

$array = New-Object System.Collections.ArrayList
$ourDocs | ForEach-Object {
    $thisDoc = $word.Documents.Open($_.FullName)
    # One output row per hyperlink in the document
    $thisDoc.Hyperlinks | ForEach-Object {
        $array.Add([pscustomobject]@{
            "File Name" = $thisDoc.FullName
            "Hyperlink" = $_Address   # this value always comes out empty
        }) | Out-Null
    }
    $thisDoc.Close()
}
$word.Quit()
$array

# cleanup com objects
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($word) | Out-Null
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
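
One likely cause, offered as a sketch rather than a confirmed diagnosis: PowerShell reads `$_Address` as a variable named `_Address`, which is never set, and not as the `Address` property of the current pipeline item `$_`. The snippet below shows the difference without needing Word. The `$links` objects and the example.com URLs are hypothetical stand-ins for the items returned by `$thisDoc.Hyperlinks`, and it assumes strict mode is off (the default).

```powershell
# Stand-in objects (hypothetical) that expose an Address property,
# in place of the Word hyperlink objects from the script.
$links = @(
    [pscustomobject]@{ Address = 'https://example.com/a' }
    [pscustomobject]@{ Address = 'https://example.com/b' }
)

# $_Address is a variable called "_Address"; it is never assigned, so it is $null.
$wrong = $links | ForEach-Object { [pscustomobject]@{ Hyperlink = $_Address } }

# $_.Address reads the Address property of the current pipeline item.
$right = $links | ForEach-Object { [pscustomobject]@{ Hyperlink = $_.Address } }

$wrong.Hyperlink   # two $null values, so nothing is printed
$right.Hyperlink   # the two addresses
```

If this is indeed the problem, changing the line to `"Hyperlink" = $_.Address` in the script should fill the column, since inside the inner `ForEach-Object` the `$_` variable refers to the hyperlink and not to the document.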