
I read an article called "Efficiently Import Large CSV into SQL Server using PowerShell, C# or VB.Net." I was using the PowerShell script from the article, and I get 3 errors that I don't understand.

I have created a small version of the CSV file with only 21 entries. I have a SQL Server table I need to import data into from a CSV file. The CSV file will be created nightly by an extract from an Oracle database and imported into SQL Server tables used by another program.

I ran the script against a 21-row dataset of 5 columns, each 10 characters wide. The last column can be empty, which is why I included such a row in the data set. Using SSMS, I created a database, CCBCMWB, on my machine, together with the table CMWBACCT I am trying to import data into. In production the CSV files would be about 1.5 million rows for each of the 3 tables I am working with. The program I used was the one from the article, modified for just the one table.

The first error is at line 38 of the script:

$columns = (Get-Content $csvfile -First 1).Split($csvdelimiter) 

where PowerShell does not recognize -First:

Get-Content : A parameter cannot be found that matches parameter name 'First'.
At C:\CMWBTST\Import-CSVtoSQLA.ps1:38 char:40
+ $columns = (Get-Content $csvfile -First <<<< 1).Split($csvdelimiter)
    + CategoryInfo          : InvalidArgument: (:) [Get-Content], ParameterBindingException
    + FullyQualifiedErrorId : NamedParameterNotFound,Microsoft.PowerShell.Commands.GetContentCommand
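For what it's worth, -First is only an alias of Get-Content's -TotalCount parameter that was added in PowerShell 3.0, so it fails on 2.0. A minimal sketch of two 2.0-compatible ways to read just the header line (the temp file name and sample content here are illustrative, not from the real data):

```powershell
# Sketch: read only the first line of a CSV in a way that works on PowerShell 2.0.
$csvfile = Join-Path $env:TEMP 'header-demo.csv'
Set-Content -Path $csvfile -Value "ACCT,ALTACCT,OPENED,LASTBILL,NEXTBILL"
$csvdelimiter = ","

# -TotalCount exists on 2.0; -First is merely a 3.0+ alias for it
$columns = (Get-Content $csvfile -TotalCount 1).Split($csvdelimiter)

# Equivalent pipeline form, as suggested in the comment below
$columns2 = (Get-Content $csvfile | Select-Object -First 1).Split($csvdelimiter)

Write-Output $columns.Count   # 5
```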

The second error I understand even less; it occurs on line 48:

$null = $datatable.Rows.Add($line.Split($csvdelimiter)) 

and occurs 21 times

Exception calling "Add" with "1" argument(s): "Input array is longer than the number of columns in this table."
At C:\CMWBTST\Import-CSVtoSQLA.ps1:48 char:32
+ $null = $datatable.Rows.Add <<<< ($line.Split($csvdelimiter))
    + CategoryInfo          : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : DotNetMethodException

The array contains five fields, which matches the number of columns in the table, so I am confused.
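A plausible explanation, assuming the failed Get-Content call above left $columns empty: the foreach loop then added fewer than five columns to the DataTable, so a five-field array really is "longer than the number of columns". A sketch reproducing the mismatch and showing that the same Add call succeeds once the column counts agree:

```powershell
# Sketch: Rows.Add() throws when the split array has more fields than the table has columns.
$dt = New-Object System.Data.DataTable
$null = $dt.Columns.Add()   # only ONE column, as when $columns failed to populate

$line = "0000050590,1390457725,2013-01-02,2016-03-07,2016-06-06"
$failed = $false
try {
    $null = $dt.Rows.Add($line.Split(','))
} catch {
    $failed = $true   # "Input array is longer than the number of columns in this table."
}

# With five columns, the same call succeeds:
$dt2 = New-Object System.Data.DataTable
1..5 | ForEach-Object { $null = $dt2.Columns.Add() }
$null = $dt2.Rows.Add($line.Split(','))
Write-Output $dt2.Rows.Count   # 1
```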

The final error occurs at line 65:

$bulkcopy.Close(); $bulkcopy.Dispose() 

with an error saying the Dispose method is missing:

Method invocation failed because [System.Data.SqlClient.SqlBulkCopy] doesn't contain a method named 'Dispose'.
At C:\CMWBTST\Import-CSVtoSQLA.ps1:65 char:37
+ $bulkcopy.Close(); $bulkcopy.Dispose <<<< ()
    + CategoryInfo          : InvalidOperation: (Dispose:String) [], RuntimeException
    + FullyQualifiedErrorId : MethodNotFound

Where is this method supposed to be defined?
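Dispose is defined on SqlBulkCopy, but as an explicit implementation of IDisposable, which PowerShell 2.0 does not surface as a public member; calling Close() is sufficient, or the object can be cast to IDisposable first. A sketch of the pattern using a stand-in class (ExplicitDemo is purely illustrative, not part of the script):

```powershell
# Sketch: why .Dispose() can be invisible - explicit interface implementation.
# ExplicitDemo mimics how SqlBulkCopy implements IDisposable.
Add-Type -TypeDefinition @"
using System;
public class ExplicitDemo : IDisposable {
    public bool Closed;
    public void Close() { Closed = true; }
    void IDisposable.Dispose() { Close(); }
}
"@

$obj = New-Object ExplicitDemo
# $obj.Dispose()                     # fails on PowerShell 2.0: not a public member
([System.IDisposable]$obj).Dispose() # the cast makes the method callable
Write-Output $obj.Closed             # True
```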

Below is the Script I used:

#################################################### 
#                                                  #
# PowerShell CSV to SQL Import Script              #
#                                                  #
#################################################### 

# Database variables 
$sqlserver = "EMRICHRT3400" 
$database = "CCBCMWB" 
$table = "CMWBACCT" 

# CSV variables 
$csvfile = "C:\CMWBTST\CMWBACCT21.csv" 
$csvdelimiter = "," 
$FirstRowColumnNames = $false 

################### No need to modify anything below ################### 
Write-Host "Script started..." 
$elapsed = [System.Diagnostics.Stopwatch]::StartNew()  
[void][Reflection.Assembly]::LoadWithPartialName("System.Data") 
[void][Reflection.Assembly]::LoadWithPartialName("System.Data.SqlClient") 

# 50k worked fastest and kept memory usage to a minimum 
$batchsize = 50000 

# Build the sqlbulkcopy connection, and set the timeout to infinite 
$connectionstring = "Data Source=$sqlserver;Integrated Security=true;Initial Catalog=$database;" 
$bulkcopy = New-Object Data.SqlClient.SqlBulkCopy($connectionstring, [System.Data.SqlClient.SqlBulkCopyOptions]::TableLock) 
$bulkcopy.DestinationTableName = $table 
$bulkcopy.bulkcopyTimeout = 0 
$bulkcopy.batchsize = $batchsize 

# Create the datatable, and autogenerate the columns. 
$datatable = New-Object System.Data.DataTable 

# Open the text file from disk 
$reader = New-Object System.IO.StreamReader($csvfile) 
$columns = (Get-Content $csvfile -First 1).Split($csvdelimiter) 
if ($FirstRowColumnNames -eq $true) { $null = $reader.readLine() } 

foreach ($column in $columns) {  
    $null = $datatable.Columns.Add() 
} 

# Read in the data, line by line 
while (($line = $reader.ReadLine()) -ne $null)  { 

    $null = $datatable.Rows.Add($line.Split($csvdelimiter)) 

    $i++; if (($i % $batchsize) -eq 0) {  
        $bulkcopy.WriteToServer($datatable)  
        Write-Host "$i rows have been inserted in $($elapsed.Elapsed.ToString())." 
        $datatable.Clear()  
    }  
}  

# Add in all the remaining rows since the last clear 
if($datatable.Rows.Count -gt 0) { 
    $bulkcopy.WriteToServer($datatable) 
    $datatable.Clear() 
} 

# Clean Up 
$reader.Close(); $reader.Dispose() 
$bulkcopy.Close(); $bulkcopy.Dispose() 
$datatable.Dispose() 

Write-Host "Script complete. $i rows have been inserted into the database." 
Write-Host "Total Elapsed Time: $($elapsed.Elapsed.ToString())" 
# Sometimes the Garbage Collector takes too long to clear the huge datatable. 
[System.GC]::Collect()

and the CSV(text) file used

0000050590,1390457725,2013-01-02,2016-03-07,2016-06-06
0000100491,8156952728,2008-12-16,2016-04-01,2016-07-01
0000120293,0000120000,2006-11-15,2016-02-18,2016-05-19
0000220299,0000220000,2006-10-11,2016-04-15,2016-07-15
0000340706,0000340000,2009-03-12,2016-02-24,2016-05-25
0000420610,9760303504,2012-05-16,2016-04-15,2016-07-15
0000500613,0000500000,2006-12-06,2016-03-01,2016-06-03
0000740524,0000740000,2006-10-18,2016-04-25,2016-07-25
0001030634,0001030000,2006-11-16,2016-02-18,2016-05-19
0001120239,0001120000,2006-12-14,2016-03-17,2016-06-17
0001150542,0001150000,2006-11-16,2016-02-18,2016-05-19
0001220144,0001220000,2006-10-10,2016-04-15,2016-07-15
0001240146,2947199958,2011-09-26,2016-04-07,2016-07-08
0001520257,7724424991,2012-12-17,2016-04-15,2016-07-15
0001530858,0001530000,2006-12-20,2016-03-22,2016-06-27
0001620362,0001620000,2006-10-16,2016-04-20,2016-07-20
0001700965,0001700000,2006-12-04,2016-03-03,2016-06-08
0001730768,0001730000,2006-10-10,2016-04-07,2016-07-07
0001910075,6494797239,2016-05-17,,
0001920876,0001920000,2006-10-31,2016-05-03,2016-08-03
0002140587,5733138981,2013-02-01,2016-04-14,2016-07-14

Any help will be appreciated.

Get-Content doesn't have a parameter -First, so the script is just broken. That bit should probably say Get-Content | Select-Object -First 1 to take the first line of the file. After reading the columns failed, it's no surprise that it throws errors about the number of columns in the table. Try making that change, re-run, and see what new errors you get. – TessellatingHeckler

1 Answer


I recommend using the Import-Csv PowerShell cmdlet to import data from the CSV file, since it is the official approach to processing CSV files, and the syntax of the cmdlet is very simple:

Import-Csv -Path target.csv -Delimiter ','

To import the data into the database, you can use the SQL Server PowerShell module instead of ADO.NET:

Invoke-Sqlcmd -Query "<sql statements>"
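A sketch of how the two could fit together, building INSERT statements from Import-Csv output; the column names, the -Header usage (the nightly file has no header row), and the demo file are all assumptions:

```powershell
# Sketch: turn headerless CSV rows into INSERT statements for Invoke-Sqlcmd.
$csvfile = Join-Path $env:TEMP 'cmwbacct-demo.csv'
Set-Content -Path $csvfile -Value "0000050590,1390457725,2013-01-02,2016-03-07,2016-06-06"

# Hypothetical column names; -Header supplies them when the file has no header row
$header = 'ACCT','ALTACCT','OPENED','LASTBILL','NEXTBILL'
$rows = Import-Csv -Path $csvfile -Delimiter ',' -Header $header

$inserts = @(foreach ($r in $rows) {
    $values = ($header | ForEach-Object { "'" + $r.$_ + "'" }) -join ','
    "INSERT INTO CMWBACCT ($($header -join ',')) VALUES ($values);"
})

# Each statement could then be run with, e.g.: Invoke-Sqlcmd -Query $inserts[0] ...
Write-Output $inserts[0]
```

Note that for files of 1.5 million rows, row-by-row INSERTs will be far slower than the SqlBulkCopy approach in the question's script, so this suits smaller loads.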

You can read the post "How to use SQL Server PowerShell Module to import data from CSV file" to download a sample.