I have a CSV file (one column/field only) with thousands of records in it.
I need a way in Powershell to search for a value using a few characters followed by a wildcard and, where found, then replace that value with a ".
I have searched around on how to do this but everyting I have found so far either doesn't cover CSV files or doesn't explain how I might be able to do the search using a wildcard.
Example of values in CSV file:
<#
RanDom.Texto 1.yellow [ Table - wood ] "gibberishcode1.moreRandomText11.xyz123+456"
[email protected] 2.blue [Chair - steel ] "gibberishcode2.moreRandomText222.xyz19283+4567+89
randomi.Textpel 3.green [ counter - granite] "gibberishcode3.moreRandomText3333.xyz17243+3210+987+654"
#>
You will note above that the only values in common across the records are the .xyz in each record.
I want to replace the .xyz (and everything that follows) with a " value.
E.g. Desired result as follows:
<#
RanDom.Texto 1.yellow [ Table - wood ] "gibberishcode1.moreRandomText11"
[email protected] 2.blue [Chair - steel ] "gibberishcode2.moreRandomText222"
Randomi.Textpel 3.green [ counter - granite] "gibberishcode3.moreRandomText3333"
#>
Here is some code I tried but it doesn't work in that it didn't replace the values (but it does successfuly export to a new csv file).
# Create function that gets the current file path (of where this script is located)
function Get-ScriptDirectory {Split-Path -parent $PSCommandPath}
# Create function that gets the current date and time in format of 1990-07-01_19h15m59
function Get-TimeStamp {return "{0:yyyy-MM-dd}_{0:HH}h{0:mm}m{0:ss}" -f (Get-Date)}
# Set current file path. Also used in both FOR loops below as primary source directory.
${sourceDirPath} = Get-ScriptDirectory
# Import CSV look-up file
${csvFile} = (Import-Csv -Path ${sourceDirPath}\SourceCSVFile.csv)
# for each row, replace the values of .xyz and all that follows with "
foreach(${row} in ${csvFile})
{
${row} = ${row} -replace '.xyz*','"'
}
# Set modified CSV's name and path
${newCSVFile} = ${sourceDirPath} + '\' + $(Get-TimeStamp) + '_SourceCSVFile_Modified.csv'
# export the modified CSV
${csvFile} | Export-Csv ${newCSVFile} -NoTypeInformation
I also tried this as an alternative but no luck either (i think this code below may only work for .txt files??) ...
((Get-Content -path C:\TEMP\TEST\SourceCSVFile.csv -Raw) -replace '.xyz'*,'"') | Export-Csv -Path C:\TEMP\TEST\ReplacementFile.csv
I'm new to Powershell and don't have a proper understanding of regex yet so please be gentle.
UPDATE and SOLUTION:
For those that are interested in my final solution ... I used the code provided by Thomas (Thank you!!) however my .csv file was left with some records that had a triple quote """ value at the end of the string.
As such I modified the code to use variables and execute a second pass of cleaning by replacing all triple quotation (e.g. """) values with a single quote value (e.g. ") and then piping the result to file.
# Create function that gets the current file path (of where this script is located and running from)
function Get-ScriptDirectory {Split-Path -parent $PSCommandPath}
# Set current file path
${sourceDirPath} = Get-ScriptDirectory
# Assign source .csv file name to variable
$origNameSource = 'AllNames.csv'
# Assign desired .csv file name post cleaning
$origNameCLEAN = 'AllNames_CLEAN.csv'
# First pass clean to replace .xyz* with " and assign result to tempCsvText variable
${tempCsvText} = ((Get-Content -Path ${sourceDirPath}\$origNameSource) | % {$_ -replace '\.xyz.*$', '"'})
# Second pass clean to replace """ with " and write result to a new .csv file
${tempCsvText} -replace '"""', '"' | Set-Content -Path ${sourceDirPath}\$origNameCLEAN
# Import records from new .csv file and remove duplicates by using Sort-Object * -Unique
${csvFile} = (Import-Csv -Path ${sourceDirPath}\$origNameCLEAN) | Sort-Object * -Unique
<pre>and</pre>tags, so they show up unaltered. - Theo