I read an article called "Efficiently Import Large CSV into SQL Server using PowerShell C# or VB Net." I was using the PowerShell script from the article and I have 3 errors that I don't understand.
I have created a small version of the the csv file with only 21 entries. I have a SQL Server table I need to import data into from CSV file. The CSV file will be created nightly as it extracts from an Oracle database and imports it into SQL Server tables used by another program.
I took the script and used it against a 21 row dataset of 5 columns all of which are 10 characters wide. There can be no entry in the last column of data, which is why I included it in the data set. I used SSMS to create a table CMWBACCT in a database, CCBCMWB, I created on my machine together with the table I am trying to import data into. The CSV files in real life would be about 1.5 million rows, for each of the 3 tables with which I am working. The Program I used was the one from the article, but modified for just the one table.
The first error is at line 38 of the script:
$columns = (Get-Content $csvfile -First 1).Split($csvdelimiter)
where PowerShell does not like -First
Get-Content : A parameter cannot be found that matches parameter name 'First'. At C:\CMWBTST\Import-CSVtoSQLA.ps1:38 char:40 + $columns = (Get-Content $csvfile -First <<<< 1).Split($csvdelimiter) + CategoryInfo : InvalidArgument: (:) [Get-Content], ParameterBin dingException + FullyQualifiedErrorId : NamedParameterNotFound,Microsoft.PowerShell.Comm ands.GetContentCommand
The second error I understand even less it occurs on line 48:
$null = $datatable.Rows.Add($line.Split($csvdelimiter))
and occurs 21 times
Exception calling "Add" with "1" argument(s): "Input array is longer than the n umber of columns in this table." At C:\CMWBTST\Import-CSVtoSQLA.ps1:48 char:32 + $null = $datatable.Rows.Add <<<< ($line.Split($csvdelimiter)) + CategoryInfo : NotSpecified: (:) [], MethodInvocationException + FullyQualifiedErrorId : DotNetMethodException
The array contains five fields which matches the number of columns in the table so I am confused.
The final error the program balks at line 65:
$bulkcopy.Close(); $bulkcopy.Dispose()
with the error the I am missing the method Dispose
Method invocation failed because [System.Data.SqlClient.SqlBulkCopy] doesn't contain a method named 'Dispose'. At C:\CMWBTST\Import-CSVtoSQLA.ps1:65 char:37 + $bulkcopy.Close(); $bulkcopy.Dispose <<<< () + CategoryInfo : InvalidOperation: (Dispose:String) [], RuntimeEx ception + FullyQualifiedErrorId : MethodNotFound
Where is this method supposed to be defined?
Below is the Script I used:
####################################################
# #
# PowerShell CSV to SQL Import Script #
# #
####################################################
# Database variables
$sqlserver = "EMRICHRT3400"
$database = "CCBCMWB"
$table = "CMWBACCT"
# CSV variables
$csvfile = "C:\CMWBTST\CMWBACCT21.csv"
$csvdelimiter = ","
$FirstRowColumnNames = $false
################### No need to modify anything below ###################
Write-Host "Script started..."
$elapsed = [System.Diagnostics.Stopwatch]::StartNew()
[void][Reflection.Assembly]::LoadWithPartialName("System.Data")
[void][Reflection.Assembly]::LoadWithPartialName("System.Data.SqlClient")
# 50k worked fastest and kept memory usage to a minimum
$batchsize = 50000
# Build the sqlbulkcopy connection, and set the timeout to infinite
$connectionstring = "Data Source=$sqlserver;Integrated Security=true;Initial Catalog=$database;"
$bulkcopy = New-Object Data.SqlClient.SqlBulkCopy($connectionstring, [System.Data.SqlClient.SqlBulkCopyOptions]::TableLock)
$bulkcopy.DestinationTableName = $table
$bulkcopy.bulkcopyTimeout = 0
$bulkcopy.batchsize = $batchsize
# Create the datatable, and autogenerate the columns.
$datatable = New-Object System.Data.DataTable
# Open the text file from disk
$reader = New-Object System.IO.StreamReader($csvfile)
$columns = (Get-Content $csvfile -First 1).Split($csvdelimiter)
if ($FirstRowColumnNames -eq $true) { $null = $reader.readLine() }
foreach ($column in $columns) {
$null = $datatable.Columns.Add()
}
# Read in the data, line by line
while (($line = $reader.ReadLine()) -ne $null) {
$null = $datatable.Rows.Add($line.Split($csvdelimiter))
$i++; if (($i % $batchsize) -eq 0) {
$bulkcopy.WriteToServer($datatable)
Write-Host "$i rows have been inserted in $($elapsed.Elapsed.ToString())."
$datatable.Clear()
}
}
# Add in all the remaining rows since the last clear
if($datatable.Rows.Count -gt 0) {
$bulkcopy.WriteToServer($datatable)
$datatable.Clear()
}
# Clean Up
$reader.Close(); $reader.Dispose()
$bulkcopy.Close(); $bulkcopy.Dispose()
$datatable.Dispose()
Write-Host "Script complete. $i rows have been inserted into the database."
Write-Host "Total Elapsed Time: $($elapsed.Elapsed.ToString())"
# Sometimes the Garbage Collector takes too long to clear the huge datatable.
[System.GC]::Collect()
and the CSV(text) file used
0000050590,1390457725,2013-01-02,2016-03-07,2016-06-06
0000100491,8156952728,2008-12-16,2016-04-01,2016-07-01
0000120293,0000120000,2006-11-15,2016-02-18,2016-05-19
0000220299,0000220000,2006-10-11,2016-04-15,2016-07-15
0000340706,0000340000,2009-03-12,2016-02-24,2016-05-25
0000420610,9760303504,2012-05-16,2016-04-15,2016-07-15
0000500613,0000500000,2006-12-06,2016-03-01,2016-06-03
0000740524,0000740000,2006-10-18,2016-04-25,2016-07-25
0001030634,0001030000,2006-11-16,2016-02-18,2016-05-19
0001120239,0001120000,2006-12-14,2016-03-17,2016-06-17
0001150542,0001150000,2006-11-16,2016-02-18,2016-05-19
0001220144,0001220000,2006-10-10,2016-04-15,2016-07-15
0001240146,2947199958,2011-09-26,2016-04-07,2016-07-08
0001520257,7724424991,2012-12-17,2016-04-15,2016-07-15
0001530858,0001530000,2006-12-20,2016-03-22,2016-06-27
0001620362,0001620000,2006-10-16,2016-04-20,2016-07-20
0001700965,0001700000,2006-12-04,2016-03-03,2016-06-08
0001730768,0001730000,2006-10-10,2016-04-07,2016-07-07
0001910075,6494797239,2016-05-17,,
0001920876,0001920000,2006-10-31,2016-05-03,2016-08-03
0002140587,5733138981,2013-02-01,2016-04-14,2016-07-14
Any help will be appreciated
Get-Content
doesn't have a parameter-First
so the script is just broken. Probably that bit should sayGet-Content | Select -First 1
to take the first line of the file. After reading the columns failed, it's no surprise that it throws errors about the number of columns in the table. Try making that change and re-run, see what new errors it has? – TessellatingHeckler