2
votes

I need to work with CSV files in PowerShell that have a duplicate column header. Why they have a duplicate column is beyond me. Such is life.

I want to use Import-Csv so that I can easily deal with the data, but since the duplicate column exists I get this error:

Import-Csv : The member "PROC STAT" is already present.
At C:\Users\MyName\Documents\SomeFolder\testScript1.ps1:10 char:9
+ $csv2 = Import-Csv $files[0].FullName
+         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [Import-Csv], ExtendedTypeSystemException
    + FullyQualifiedErrorId : AlreadyPresentPSMemberInfoInternalCollectionAdd,Microsoft.PowerShell.Commands.ImportCsvCommand

I could manually fix the problem by going into every CSV file and deleting the duplicate column, but this is not an option: there are hundreds of them, and the script needs to be run periodically. Ideally I am looking for a way to programmatically remove that column (Import-Csv won't work) or programmatically change the name of the column (so that I can then Import-Csv and delete it). Any suggestions?

My code to loop through all the files:

$files = Get-ChildItem "C:\Users\MyName\Documents\SomeFolder\Data" -Filter *.csv

foreach($file in $files) {
    $csv = Import-Csv $file.FullName
}
4
Do you know the header order in advance? - Mathias R. Jessen
@MathiasR.Jessen yes - Thomas
I can use Get-Content $file.FullName -First 1 to get the first line (headers) from the file. But I am not sure how to modify/write out to a new file. - Thomas

4 Answers

3
votes

You can specify custom header names with the Header parameter:

Import-Csv .\file.csv -Header header1,header2,header3

This will treat the original header line as a normal row, so skip the first output object with Select-Object:

Import-Csv .\file.csv -Header header1,header2,header3 |Select-Object -Skip 1
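
Since the header order is known in advance, the same idea can also drop the duplicate column after import. The following is only a minimal sketch: the sample file, the column names other than PROC STAT, and the placeholder name 'PROC STAT DUPLICATE' are assumptions made up for illustration.

```powershell
# Hypothetical sample file standing in for one of the real CSV files.
$samplePath = Join-Path ([System.IO.Path]::GetTempPath()) 'duplicate-header-sample.csv'
Set-Content -Path $samplePath -Value @(
    'ID,PROC STAT,NAME,PROC STAT'
    '1,OK,alpha,OK'
    '2,FAIL,beta,FAIL'
)

# Assumed column order; the second PROC STAT gets a unique placeholder name.
$header = 'ID', 'PROC STAT', 'NAME', 'PROC STAT DUPLICATE'

$rows = @(
    Import-Csv -Path $samplePath -Header $header |
        Select-Object -Skip 1 |   # the file's own header line arrives as a data row
        Select-Object -Property * -ExcludeProperty 'PROC STAT DUPLICATE'
)

$rows
```

Passing -Property * together with -ExcludeProperty is deliberate: in Windows PowerShell, -ExcludeProperty has no effect unless -Property is also given. Inside the question's foreach loop, $samplePath would be replaced by $file.FullName.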
1
votes

You can load the data with Get-Content and convert it like this:

Get-Content "C:\temp\test.csv" | ConvertFrom-String -Delimiter "," | select -Skip 1

Short version:

gc "C:\temp\test.csv" | cfs -D "," | select -Skip 1

If you don't want the columns to be named automatically, you can name them manually like this:

gc "C:\temp\test.csv" | cfs -D "," -PropertyNames head1, head2, head3 | select -Skip 1
0
votes

Here's an example of how to do it without needing to hard-code the column header names in the code (i.e., dynamically generate a generic header based on the number of columns in the CSV file):

$csvFile = "test.csv"

# Count columns in CSV file
$columnCount = (Get-Content $csvFile |
  Select-Object -Index 1,2 |
  ConvertFrom-Csv |
  Get-Member -MemberType NoteProperty |
  Measure-Object).Count

# Create list of generic property names (no duplicates)
$propertyNames = 1..$columnCount |
  ForEach-Object { "Property{0}" -f $_ }

# Get CSV file content, skip header line, and convert from CSV using generic header
Get-Content $csvFile |
  Select-Object -Skip 1 |
  ConvertFrom-Csv -Header $propertyNames

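One caveat with this solution is that the CSV file must have at least two rows of data (not counting the header line). The first of those rows is used as a temporary header while counting columns, so its values must themselves be unique; otherwise ConvertFrom-Csv fails with the same "already present" error.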
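
A possible variant, shown only as a sketch, keeps the original column names and makes them unique by reading the header line itself, so it does not depend on the data rows. It assumes that header names contain no quoted commas (the split on "," is naive); the sample file and the "_2" suffix scheme are made up for illustration.

```powershell
# Hypothetical sample file standing in for a real CSV.
$csvFile = Join-Path ([System.IO.Path]::GetTempPath()) 'duplicate-header-dynamic.csv'
Set-Content -Path $csvFile -Value @(
    'ID,PROC STAT,NAME,PROC STAT'
    '1,OK,alpha,OK'
    '2,FAIL,beta,FAIL'
)

# @() keeps this an array even if the file holds only the header line.
$lines = @(Get-Content -Path $csvFile)

# Build unique names: the first occurrence keeps its name, repeats get _2, _3, ...
# Hashtable keys are case-insensitive, matching how property names are compared.
$seen = @{}
$uniqueHeader = foreach ($name in $lines[0].Split(',')) {
    $name = $name.Trim('"')
    if ($seen.ContainsKey($name)) {
        $seen[$name]++
        '{0}_{1}' -f $name, $seen[$name]
    }
    else {
        $seen[$name] = 1
        $name
    }
}

# Convert everything after the header line using the de-duplicated names.
$records = @($lines | Select-Object -Skip 1 | ConvertFrom-Csv -Header $uniqueHeader)

$records
```

To write a cleaned copy to a new file rather than just working with the objects, $records could be piped to Export-Csv -NoTypeInformation with a new path.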

0
votes

I ran into this a few times as well and wrote this as a workaround. It works with any CSV file, even if all or several column names are the same.


    function Import-DuplicateHeaderCSV{
    <#
    .SYNOPSIS
    Workaround function for the PowerShell error: "Import-Csv : The member "column_name" is already present."
    This error is returned when attempting to use the Import-Csv cmdlet on a CSV file which has duplicate column names.
    
    .DESCRIPTION
    The headers are looped through, read in, and parsed into an array.
    Duplicate headers are stored in a hashtable, e.g. @{columnName = numOccurrences}.
    Multiple occurrences of the header are supported by incrementing the value in the hashtable for each occurrence.
    The duplicate header is then inserted into the array as columnName_COPYnumOccurrences.
    Import-Csv is then used normally with the new column header array as the -Header parameter.
    
    .PARAMETER Path
    The full file path
    e.g. "C:\users\johndoe\desktop\myfile.csv"
    #>
    
        param(
            [Parameter(Mandatory=$true)] [string] $Path
        )
        
        $headerRow = Get-Content $Path | ConvertFrom-String -Delimiter "," | Select-Object -First 1 
        $objectSize = ($headerRow | Get-Member -MemberType NoteProperty | Measure-Object).Count
        $headers = @()
        $duplicates = @{}
    
        for ($i = 1; $i -le $objectSize; $i++){
            if ($headers -notcontains $headerRow."P$i"){
                $headers += $headerRow."P$i"
            }else{
                if ($duplicates.$($headerRow."P$i") -gt 0){
                    $duplicates.$($headerRow."P$i")++
                }else{
                    $duplicates.$($headerRow."P$i") = 1   
                }
                $header = $($headerRow."P$i")
                $header = $header + "_COPY"
                $header = $header + ($duplicates.$($headerRow."P$i"))
                $headers += $header
            }
        }
    
        # Pass the array straight to -Header. This avoids Invoke-Expression, so a
        # quote character in a header name or in the path cannot break the command.
        # With -Header, the file's own header line is returned as the first record,
        # so it is skipped here.
        $data = Import-Csv -Path $Path -Header $headers | Select-Object -Skip 1
        return $data
    }