0
votes

I am working on a script for PowerShell to import data from 2 separate Excel sheets, compare all properties of these sheets and export a new Excel worksheet with this output. The purpose of this script is to compare an updated excel sheet with a previous version and spot what has changed, what has been added and possibly even what has been removed.

I have been using the Import-Excel module to handle the first part, and Compare-Object for the data comparison. As far as I can tell Import-Excel imports the Excel data into a System.Object which is a hashtable of PSCustomObjects. Each PSCustomObject corresponds to a row in the Excel sheet.

I am running the following code:

$global:InputdataA = Import-Excel -Path $InputA -WorkSheetname $InputSheetA
$global:InputdataB = Import-Excel -Path $InputB -WorkSheetname $InputSheetB
$global:ReferenceObject = $InputdataA
$global:DifferenceObject = $InputdataB

$global:InputdataHeadersA = $InputdataA[0].psobject.properties.name
$global:InputdataHeadersB = $InputdataB[0].psobject.properties.name

$props = $InputdataHeadersA
$props += $InputdataHeadersB
$props = $props | Select -Unique

$compareResult = Compare-Object -ReferenceObject $ReferenceObject -DifferenceObject $DifferenceObject -Property $props -PassThru -CaseSensitive

I am using 2 excel files for testing:

Personnel_16_12_2018 - small2.xlsx (previous version) Personnel_28_11_2018 - small2.xlsx (new version with changes and additions)

The files can be downloaded here:

https://ufile.io/bmstu https://ufile.io/3z62x


The output I would like to see would be only the 7 entries that have been changed/added, with data from difference object (the new version of the excel sheet) This would represent the latest and "correct" data.

Currently I get an output from the compare-object, containing the 7 entries AND the 6 entries that have been changed from the reference object including side-indicator.

Is it possible to make the compare-object return only the changes or do I have to process the output afterwards?

1
Could you give some sample data that indicates the structure of your XLSX-files? If I compare 2 spreadsheets via Compare-Object and name each Headline in the -Property Argument, it works as expected - Razorfen
Hi Razorfen, I changed the code to include all headers in properties like this: $props = $InputdataHeadersA $props += $InputdataHeadersB $props = $props | Select -Unique $compareResult = Compare-Object -ReferenceObject $ReferenceObject -DifferenceObject $DifferenceObject -Property $props -PassThru -CaseSensitive #added array as properties Now it returns all changes, but it returns the changed rows twice: once for the new and changed data and once for the old data including side indicator. How do I change this to get only the new and changed items? - Rasmus Eilertsen
Could you edit your above question and insert some sample data (two sample spreadsheets) and your desired output? - Razorfen
Post has been updated - Rasmus Eilertsen

1 Answers

0
votes

Based on your additional information and the sample files, you could try something like this:

$oldFile = Import-Excel ".\personnel_28_11_2018---small2.xlsx"
$newFile = Import-Excel ".\personnel_16_12_2018---small2.xlsx"

$properties = "TRIAL_PK", "TRIALCOUNTRY_PK", "TRIALSSITE_PK", "ASSIGNMENT_LVL", "ROLE", "INT_EXT", "START_DATA", "END_DATE", "PERSONNELL_PK", "TITLE", "LAST_NAME", "FIRST_NAME", "ORGANIZATION_NAME"

$result = Compare-Object -ReferenceObject $oldFile -DifferenceObject $newFile -Property $properties -PassThru -CaseSensitive | Where-Object {$_.SideIndicator -eq "=>"}

$result | Select-Object $properties | Export-Excel ".\changed.xlsx"