1
votes

Sorry for the millionth question about this, but I've read so much about the topic and still don't get this error fixed (newbie to all of this). I'm trying to display the content of a postgres table on a website with flask (using Ubuntu 16.04/python 2.7.12). There are non-ascii characters in the table ('ü' in this case) and the result is a UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2: ordinal not in range(128).

This is what my __init__.py looks like:

# -*- coding: utf-8 -*-

from flask import Blueprint, render_template
import psycopg2
from .forms import Form
from datetime import datetime
from .table import Item, ItemTable

test = Blueprint('test', __name__)

def init_test(app):
    app.register_blueprint(test)

def createTable(cur):
    cmd = "select * from table1 order by start desc;"
    cur.execute(cmd)
    queryResult = cur.fetchall()
    items = []
    table = 'table could not be read'
    if queryResult is not None:
        for row in range(0, len(queryResult)):
            items.append(Item(queryResult[row][0], queryResult[row][1].strftime("%d.%m.%Y"), queryResult[row][2].strftime("%d.%m.%Y"),
                              queryResult[row][1].strftime("%H:%M"), queryResult[row][2].strftime("%H:%M"),
                              queryResult[row][3], queryResult[row][4], queryResult[row][5], queryResult[row][6]))
        table = ItemTable(items)
    return table


@test.route('/test')
def index():
    dbcon = psycopg2.connect("dbname=testdb user=postgres host=localhost")
    cur = dbcon.cursor()
    table = createTable(cur)
    cur.close()
    return render_template('test_index.html', table=table)

And part of the html-file:

{% extends "layout.html" %}
{% block head %}Title{% endblock %}
{% block body %}
<script type="text/javascript" src="{{ url_for('static', filename='js/bootstrap.js') }}"></script>
<link rel="stylesheet" type="text/css" href="{{ url_for('static', filename='css/custom.css') }}">
<div class="row" id="testid">
    {{table}}
</div>
{% endblock %}{#
Local Variables:
coding: utf-8
End: #}

The problem is in queryResult[row][6], which is the only column in the table with strings; the rest are integers. The encoding of the postgres database is utf-8. The type of queryResult[row][6] returns type 'str'. What I read here is that the string should be encoded in utf-8, as that is the encoding of the database client. Well, that doesn't seem to work!? Then I added the line

psycopg2.extensions.register_type(psycopg2.extensions.UNICODE)

to force the result to be unicode (type of queryResult[row][6] returned type 'unicode'), because as was recommended here, I tried to stick to unicode everywhere. Well that resulted in a UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 2: ordinal not in range(128). Then I thought, maybe something went wrong with converting to string (bytes) before and I tried to do it myself then with writing

queryResult[row][6].encode('utf-8', 'replace')

which led to a UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2: ordinal not in range(128). It didn't even work with 'ignore' instead of 'replace'. What is going on here? I checked whether render_template() has a problem with unicode by creating and passing a variable v=u'ü', but that was no problem and was displayed correctly. Yeah, I read the usual recommended stuff like nedbatchelder.com/text/unipain.html and Unicode Demystified, but that didn't help me solve my problem here; I'm obviously missing something.

Here is a traceback of the first UnicodeDecodeError:

File "/home/name/Desktop/testFlask/venv/lib/python2.7/site-packages/flask/app.py", line 2000, in __call__
return self.wsgi_app(environ, start_response)
File "/home/name/Desktop/testFlask/venv/lib/python2.7/site-packages/flask/app.py", line 1991, in wsgi_app
response = self.make_response(self.handle_exception(e))
File "/home/name/Desktop/testFlask/venv/lib/python2.7/site-packages/flask/app.py", line 1567, in handle_exception
reraise(exc_type, exc_value, tb)
File "/home/name/Desktop/testFlask/venv/lib/python2.7/site-packages/flask/app.py", line 1988, in wsgi_app
response = self.full_dispatch_request()
File "/home/name/Desktop/testFlask/venv/lib/python2.7/site-packages/flask/app.py", line 1641, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/name/Desktop/testFlask/venv/lib/python2.7/site-packages/flask/app.py", line 1544, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/name/Desktop/testFlask/venv/lib/python2.7/site-packages/flask/app.py", line 1639, in full_dispatch_request
rv = self.dispatch_request()
File "/home/name/Desktop/testFlask/venv/lib/python2.7/site-packages/flask/app.py", line 1625, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/name/Desktop/testFlask/app/test/__init__.py", line 95, in index
return render_template('test_index.html', table=table) #, var=var
File "/home/name/Desktop/testFlask/venv/lib/python2.7/site-packages/flask/templating.py", line 134, in render_template
context, ctx.app)
File "/home/name/Desktop/testFlask/venv/lib/python2.7/site-packages/flask/templating.py", line 116, in _render
rv = template.render(context)
File "/home/name/Desktop/testFlask/venv/lib/python2.7/site-packages/jinja2/environment.py", line 989, in render
return self.environment.handle_exception(exc_info, True)
File "/home/name/Desktop/testFlask/venv/lib/python2.7/site-packages/jinja2/environment.py", line 754, in handle_exception
reraise(exc_type, exc_value, tb)
File "/home/name/Desktop/testFlask/app/templates/test_index.html", line 1, in top-level template code
{% extends "layout.html" %}
File "/home/name/Desktop/testFlask/app/templates/layout.html", line 40, in top-level template code
{% block body %}{% endblock %}
File "/home/name/Desktop/testFlask/app/templates/test_index.html", line 7, in block "body"
{{table}}
File "/home/name/Desktop/testFlask/venv/lib/python2.7/site-packages/flask_table/table.py", line 86, in __html__
tbody = self.tbody()
File "/home/name/Desktop/testFlask/venv/lib/python2.7/site-packages/flask_table/table.py", line 103, in tbody
out = [self.tr(item) for item in self.items]
File "/home/name/Desktop/testFlask/venv/lib/python2.7/site-packages/flask_table/table.py", line 120, in tr
''.join(c.td(item, attr) for attr, c in self._cols.items()
File "/home/name/Desktop/testFlask/venv/lib/python2.7/site-packages/flask_table/table.py", line 121, in <genexpr>
if c.show))
File "/home/name/Desktop/testFlask/app/test/table.py", line 7, in td
self.td_contents(item, self.get_attr_list(attr)))
File "/home/name/Desktop/testFlask/venv/lib/python2.7/site-packages/flask_table/columns.py", line 99, in td_contents
return self.td_format(self.from_attr_list(item, attr_list))
File "/home/name/Desktop/testFlask/venv/lib/python2.7/site-packages/flask_table/columns.py", line 114, in td_format
return Markup.escape(content)
File "/home/name/Desktop/testFlask/venv/lib/python2.7/site-packages/markupsafe/__init__.py", line 165, in escape
rv = escape(s)

Any help is greatly appreciated...

3 Answers

0
votes

In Python 2 the distinction between bytes and text is not enforced, so it is easy to confuse the two. As far as I know, encoding goes from a unicode string to bytes and decoding goes the other way. So if your result set is already a str, there should be no need to encode it again. If you get wrong representations for special characters like "§", I would try something like this:

repr(queryResult[row][6])

Does that work?

0
votes

See: https://wiki.python.org/moin/UnicodeEncodeError

The encoding of the postgres database is utf-8. The type of queryResult[row][6] returns type 'str'.

You've got it right so far. Remember, in Python 2.7, a str is a string of bytes. So you've got a string of bytes from the database, that probably looks like 'gl\xc3\xbce' ('glüe').

What happens next is that some part of the program is calling .decode on your string, but using the default 'ascii' codec. It's probably some part of the Item() API that needs the string as a unicode object, or maybe Flask itself. Either way, you need to call .decode yourself on your string, since you know that it's actually in utf-8:

col_6 = queryResult[row][6].decode('utf-8')
Item(..., ..., col_6, ...)

Then you will provide all the downstream APIs with a unicode which is apparently what they want.
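
To make the implicit step visible, here is a minimal sketch that should run on both Python 2 and Python 3. The value in `raw` is the made-up 'glüe' sample from above, not real query output, and the explicit `.decode('ascii')` only imitates what Python 2 does behind the scenes when a byte str is mixed with unicode.

```python
# -*- coding: utf-8 -*-
# Sketch only: `raw` is a hypothetical sample, not real query output.
raw = b'gl\xc3\xbce'  # UTF-8 bytes for u'gl\xfce', like the str the database returns

# Imitate the implicit conversion Python 2 performs with its default codec.
try:
    raw.decode('ascii')
    error_text = None
except UnicodeDecodeError as exc:
    error_text = str(exc)

# Decoding explicitly with the codec the data is really in succeeds.
col_6 = raw.decode('utf-8')

print(error_text)
print(repr(col_6))
```

The failure is reported at position 2 because 'g' and 'l' are plain ASCII and the first byte of the encoded 'ü' is the third byte of the string.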

The way I remember it is this: Unicode is an abstraction, where everything is represented as "code points". If we want to create real bytes that we can print on a screen or send as an HTML file, we need to ENcode to bytes. If you have some bytes, they could mean any letters, who knows? You need to DEcode the mysterious bytes in order to get Unicode.
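
The same mnemonic as a small round trip, again only a sketch using the made-up 'glüe' sample. It also shows why bytes "could mean any letters": decoding with the wrong codec (latin-1 is chosen here purely for illustration) raises no error, it just produces different text.

```python
# -*- coding: utf-8 -*-
text = u'gl\xfce'                   # four code points: g, l, u-umlaut, e
data = text.encode('utf-8')         # ENcode: code points -> bytes
round_trip = data.decode('utf-8')   # DEcode with the matching codec
misread = data.decode('latin-1')    # same bytes, wrong codec: no error, wrong letters

print(len(text))
print(len(data))
print(repr(data))
print(round_trip == text)
print(misread == text)
```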

Hope this helps.

0
votes

So I finally found a solution after sticking to unicode everywhere with the help of

psycopg2.extensions.register_type(psycopg2.extensions.UNICODE)

The error led me then to my own class customCol(Col) that I had written:

class customCol(Col):
    def td(self, item, attr):
        return '<td><div id="beschrCol">{}</div></td>'.format(
            self.td_contents(item, self.get_attr_list(attr)))

The problem here was the .format() call, and after reading this, I just turned the string in front of .format to unicode and the problem was solved,

def td(self, item, attr):
    return u'<td><div id="beschrCol">{}</div></td>'.format...

Works with passing strings to Item(), too, but then I had to put

queryResult[row][6].decode('utf-8')

in the Item() call.
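
For anyone hitting the same thing, here is a rough sketch of why the .format() call mattered. `cell` is a made-up value standing in for the cell contents. As far as I understand it, with a byte-string template Python 2 has to turn the unicode argument into a str first and uses the ascii codec for that; the explicit `.encode('ascii')` below only imitates that step so the snippet also runs on Python 3.

```python
# -*- coding: utf-8 -*-
cell = u'gl\xfce'  # hypothetical cell content, already unicode

# A unicode template keeps everything as text, so nothing needs converting.
template = u'<td><div id="beschrCol">{}</div></td>'
html = template.format(cell)

# Imitation of what a byte-string template forces on Python 2:
# the unicode argument is squeezed through the ascii codec.
try:
    cell.encode('ascii')
    error_text = None
except UnicodeEncodeError as exc:
    error_text = str(exc)

print(repr(html))
print(error_text)
```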