1
votes
import csv
import sqlite3

fileName = 'australianpublicholidays.csv'
accessMode = 'r'

# Create a database in RAM
holidayDatabase = sqlite3.connect(':memory:')

# Create a cursor
c = holidayDatabase.cursor()

# Create a table
c.execute('''CREATE TABLE holidays 
(date text, holidayName text, information text, moreInformation text, applicableTo text)''')

# Read the file contents in to the table
with open(fileName, accessMode) as publicHolidays :
    listOfPublicHolidays = csv.reader(publicHolidays)

    for currentRow in listOfPublicHolidays :
        for currentEntry in currentRow :
            c.execute('INSERT INTO holidays VALUES (?, ?, ?, ?, ?)', currentEntry)

# Close the database
holidayDatabase.close()

The following line

    c.execute('INSERT INTO holidays VALUES (?, ?, ?, ?, ?)', currentEntry)

is causing this error

Incorrect number of bindings supplied. The current statement uses 5, and there are 4 supplied.

2
What do the contents of currentEntry look like? - ajshort
https://imgur.com/VEn8i0k Running in a virtual machine - C. P. Wagner

2 Answers

1
votes

currentRow is already a sequence. It's a list of all of the fields in the row. If you were to print out currentRow, you'd get output like this (assuming this is your data set https://data.gov.au/dataset/australian-holidays-machine-readable-dataset):

['Date', 'Holiday Name', 'Information', 'More Information', 'Applicable To']
['20150101', "New Year's Day", "New Year's Day is the first day of the calendaryear and is celebrated each January 1st", '', 'NAT']
['20150126', 'Australia Day', 'Always celebrated on 26 January', 'http://www.australiaday.org.au/', 'NAT']
['20150302', 'Labour Day', 'Always on a Monday, creating a long weekend. It celebrates the eight-hour working day, a victory for workers in the mid-late 19th century.',http://www.commerce.wa.gov.au/labour-relations/public-holidays-western-australia', 'WA']
...

When you do

for currentEntry in currentRow :
    c.execute('INSERT INTO holidays VALUES (?, ?, ?, ?, ?)', currentEntry)

You're actually getting a list of all of the characters in the first element in the list. Because you didn't skip the header row, you're actually getting a list of the characters in the word "Date". Which equals 4 characters and is causing the error:

sqlite3.ProgrammingError: Incorrect number of bindings supplied. The current sta
tement uses 5, and there are 4 supplied.

If you were to skip the header line using next(listOfPublicHolidays, None) as in:

with open(fileName, accessMode) as publicHolidays :
    listOfPublicHolidays = csv.reader(publicHolidays)
    next(listOfPublicHolidays, None)
    for currentRow in listOfPublicHolidays :
        for currentEntry in currentRow :
        c.execute('INSERT INTO holidays VALUES (?, ?, ?, ?, ?)', currentEntry)

you would get the following error message because currentEntry would be a list of the characters in "20150101", having a length of 8:

Traceback (most recent call last):
  File "holidaysorig.py", line 25, in <module>
    c.execute('INSERT INTO holidays VALUES (?, ?, ?, ?, ?)', tuple(currentEntry)
)
sqlite3.ProgrammingError: Incorrect number of bindings supplied. The current statement uses 5, and there are 8 supplied.

That's why it works (for the most part) when you remove the for currentEntry in currentRow : block and rewrite it as:

import csv
import sqlite3

fileName = 'australianpublicholidays.csv'
accessMode = 'r'

# Create a database in RAM
holidayDatabase = sqlite3.connect(':memory:')

# Create a cursor
c = holidayDatabase.cursor()

# Create a table
c.execute('''CREATE TABLE holidays 
(date text, holidayName text, information text, moreInformation text, applicableTo text)''')

# Read the file contents in to the table
with open(fileName, accessMode) as publicHolidays :
    listOfPublicHolidays = csv.reader(publicHolidays)
    for currentRow in listOfPublicHolidays :
        c.execute('INSERT INTO holidays VALUES (?, ?, ?, ?, ?)', currentRow)

# Close the database
holidayDatabase.close()

NOTE: On my machine, I got the following errors:

(holidays) C:\Users\eyounjo\projects\holidays>python holidaysorig.py
Traceback (most recent call last):
  File "holidaysorig.py", line 22, in <module>
    c.execute('INSERT INTO holidays VALUES (?, ?, ?, ?, ?)', currentRow)
sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.

So I rewrote your script as the following to deal with the above:

import csv, codecs
import sqlite3

# Encoding fix
def latin_1_encoder(unicode_csv_data):
    for line in unicode_csv_data:
        yield line.encode('latin-1')

fileName = 'australianpublicholidays.csv'
accessMode = 'r'

# Create a database in RAM
holidayDatabase = sqlite3.connect(':memory:')

# Create a cursor
c = holidayDatabase.cursor()

# Create a table
c.execute('''CREATE TABLE holidays 
(date text, holidayName text, information text, moreInformation text, applicableTo text)''')

# Read the file contents in to the table
# Encoding fix
with codecs.open(fileName, accessMode, encoding='latin-1') as publicHolidays :
    listOfPublicHolidays = csv.reader(latin_1_encoder(publicHolidays))
    # Skip the header row
    next(listOfPublicHolidays, None)
    entries = []
    for currentRow in listOfPublicHolidays:
        # Work-around for "You must not use 8-bit bytestrings" error
        entries.append(tuple([unicode(field, 'latin-1') for field in currentRow]))
    c.executemany('INSERT INTO holidays VALUES (?, ?, ?, ?, ?)', entries)

# Close the database
holidayDatabase.close()
0
votes

I have corrected the error by removing the nested for loop

Replaced the following

# Read the file contents in to the table
with open(fileName, accessMode) as publicHolidays :
    listOfPublicHolidays = csv.reader(publicHolidays)

    for currentRow in listOfPublicHolidays :
        for currentEntry in currentRow :
            c.execute('INSERT INTO holidays VALUES (?, ?, ?, ?, ?)', currentEntry)

With the following

with open(fileName, accessMode) as publicHolidays :
    listOfPublicHolidays = csv.reader(publicHolidays)

    for currentRow in listOfPublicHolidays :
            c.execute('INSERT INTO holidays VALUES (?, ?, ?, ?, ?)', currentRow)

However the cause of the error is still unclear to me.