2
votes

I am using Pandas Read Sql in Python for executing sql-statements in IBM DB2 and saving the response in pandas dataframes. Now I am trying to execute a sql-statement containing a non ASCII character, a letter from the Swedish alphabet: 'Å' (others are Å, Ä and Ö) but I am getting this error:

"DatabaseError: Execution failed on sql 'SELECT * FROM DATA_CONFIG WHERE TAG_NAME='Å'': ibm_db_dbi::Error: Error occure during processing of statement"

This is my code:

import ibm_db
import ibm_db_dbi
import pandas as pd

def sqlToFrame(sql): # Function for creating pandas dataframes from SQL-statements
    con = ibm_db.connect(connection_string, "", "")
    conn = ibm_db_dbi.Connection(con)
    return pd.read_sql(sql, conn)

df = sqlToFrame("SELECT * FROM DATA_CONFIG WHERE TAG_NAME='Å'")

I've tried executing the statement in the IDE in the IBM DB2 interface which works perfectly fine. Therefore I am figuring that the problem might be connected to how I establish the connection/the DB2 driver pandas uses. I have tried finding a way to set an encoding but can't find anything. How can I solve this? I also know this is possible because another package that builds upon ibm_db accept these characters. The characters are a from the ISO-8859-1 series

1
Which version of Python? Which bitness? Which version of ibm_db, and ibm_db_dbi ? Also what is the encoding of your target Db2 database? - mao
Python2.7 with the latest version of those packages. The target DB2 database uses the default encoding @mao - danielo
Is it 32-bit python , or 64-bit python? What operating-system runs the Db2-server? - mao
Suggest you try with 64-bit Python 3.5 or higher. Seems ibm_db.c barfs when PyUnicode_FromObject() returns null or Py_None. - mao
Can you carefully check your attempted SQL as there is no closing single quote after the character from posted error (NOT posted query)? I passed your special character to a DB2 connection with Python 3.5 and no error was raised. - Parfait

1 Answers

1
votes

Consider parameterizing your query using the params argument of pandas.read_sql and pass the accented character 'Å' with u'' prefix to bind value to the unquoted ? placeholder in SQL query. Do note: params requires a sequence and so below passes a tuple of one item.

Unlike Python 2.x, all strings in Python 3.x are Unicode strings and so accented literals (non-ascii) do not need explicit decoding with u'...'. Hence, why I cannot reproduce your issue in my Python 3.5 running a DB2 SQL query with accented characters.

import pandas as pd
...

# Function for creating pandas dataframes from SQL-statements
def sqlToFrame(sql): 
    db = ibm_db.connect(connection_string, "", "")
    con = ibm_db_dbi.Connection(db)

    return pd.read_sql(sql, con, params = (u'Å',))


df = sqlToFrame("SELECT * FROM DATA_CONFIG WHERE TAG_NAME = ?")