0
votes

I am trying to write a script in powershell that can loop through Excel sheets from a prepared Excel file and extract a range of values in each sheet, which I then pipe into the import-csv cmdlet. This is the first step in a larger script that acts on the csv files; I am trying to consolidate all the steps into 1 convenient script.

My problem is that I need:

  • the script to work without Excel installed (rules out COM object Excel.Application)

  • cannot install powershell modules (rules out the popular ImportExcel).

  • is usable on xlsx files (rules out jet 4.0 with excel object 8.0)

  • Doesn't require downloads/admin permissions to directories or has simple workarounds for this.

In short:

Is importing an excel sheet to CSV via a PowerShell script possible with only pre-installed Windows functionalities?

The next best thing would be minimal adjustments, such as bundling a small library with the script that can be easily referenced in the script (would Open XML SDK or EPPlus 4.5.3.3 fall into this category?).

Thank you.

1

1 Answers

1
votes

It's possible to work with the raw Excel data, but you will have to "reverse engineer" the format. I was able to get some useful data from a very simple sheet.

To test and play around with this create an empty folder and save an Excel document as Book1xlsx with some values like this:

| Name    | Value |
| adf     | 5     |
| fgfdg   | 4     |
| dfgdsfg | 3     |

Then place this script there, and see the result. If your data is any more advanced with this, you probably have to spend quite a bit of time figuring out how different types and sheets are named, and how to look them up.

unzip Book1.xlsx

[xml]$sheet = Get-Content "xl\worksheets\sheet1.xml"
[xml]$strings = Get-Content "xl\sharedStrings.xml"

$stringsTable = $strings.sst.si.t

$data = $sheet.worksheet.sheetData.row | % {
    # Each column for each row is in the "c" variable
    # (The ,@() is a hack to avoid powershell from turning everything into a single array)
    return ,@($_.c | % {
        # There is a "t" property that represents the type. 
        if ($_.t -like "s") {
            # "s" means a string. To get the actual content we need to look up in the strings xml
            return $stringsTable[$_.v]
        } elseif ($_.t -like "") {
            # Empty type means integer, we can return the value as is
            return $_.v
        }
    })
}

# Data will be a 2 dimensional array
# $data[0][0] will refer to A1
# $data[1][0] will refer to A2
# $data[0][1] will refer to B1
$data

Hopefully this will be enough to get you started.

Edit: Here is also some code to convert the 2 dimensional array into a PSObject you can use with Export-Csv.

$headers = $data[0]

$dataObject = $data | Select-Object -Skip 1 | % {
    $row = $_
    $index = 0
    $object = @{}
    foreach ($column in $row) {
        $object[$headers[$index]] += $column
        $index++
    }
    return [PSCustomObject]$object
}

$dataObject | Export-Csv ...