1
votes

the data in my csv file is like this:

081299289X,China Dolls,Lisa See,2014
0345498127,Starter for Ten,David Nicholls,2003
0061053716,Imajica,Clive Barker,1991
0553262149,Emily Climbs,L.M. Montgomery,1925

my import.py is like this:

import csv
import os

from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

engine = create_engine('postgres://pnrvnavoxvnlgw:....')
db = scoped_session(sessionmaker(bind=engine))

def main():
    f = open("books.csv")
    reader = csv.reader(f)
    for isbn, title, author, year in reader:
        db.execute(
            "INSERT INTO books (isbn, title, author, publication_year)
               VALUES (:isbn, :title, :author, :publication_year)",
            {"isbn": isbn, "title": title, "author": author, "publication_year": year}
        )
    db.commit()

if __name__ == "__main__":
    main()

For some reason I cannot seem to see what's wrong with this code. this is the error:

sqlalchemy.exc.DataError: (psycopg2.DataError) invalid input syntax for integer: "year"
LINE 1: ...publication_year) VALUES ('isbn', 'title', 'author', 'year')

Help?

2

2 Answers

5
votes

From the looks of it your CSV contains a header as the first row. Skip it with for example

next(reader, None)

before your for-loop.

0
votes

As answered on a similar question. I cannot see the header from Ilja Everilä's answer, but of course if there are headers, Everilä is completely right, you need to skip it (or perhaps make it into a map).

My answer propose a general design improvement, since ISBN should probably not be a number that said if there is a header, be aware that this change will hide that error until you inspect the data.

Change database schema (probably preferred)

Change the ISBN type to varchar(255)

Reasoning for preference: Is ISBN actually an integer? I.e. would you ever perform integer operations on it as addition? Or is is an ID that just happens to currently be formatted as numbers? Also ISBN is an ID format outside your control. Do you really want you application to be tightly coupled to the current formatting? What will happen if ISBN changes to include letters in the future?

Alternative, convert the string

Alternatively you can just convert isbn:

isbnAsNumber = int(isbn.strip().replace("-", ""))
db.execute("INSERT INTO books (isbn, title, author, year) VALUES (:isbn, :title, :author, :year)",
            {"isbn": isbnAsNumber, "title": title, "author": author, "year": year})

But I would consider this a dirty solution compared to changing the logical type, as suggested in the first solution.