I have a MySQL table which is set to CHARACTER SET utf8mb4, with a column x which has CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci. I can run an SQL command directly on the database which inserts a 4-byte Unicode character, like

INSERT INTO mytable (x) VALUES ('💩');

but when I run the following in web2py, I get a different entry, which looks like ????

sql = u"INSERT INTO mytable (x) VALUES (%s)"
db.executesql(sql, (u'💩',))

Is there something I need to set in web2py somewhere to tell it to pass the unicode characters through without alteration?

Addendum: the same ???? entry occurs when I use the DAL too, as in

db.mytable.insert(x=u'💩')
Update: I get the following warnings in the logs:

/usr/local/lib/python2.7/site-packages/pymysql/cursors.py:165: Warning: (1300, u"Invalid utf8 character string: 'F09F92'")
  result = self._query(query)
/usr/local/lib/python2.7/site-packages/pymysql/cursors.py:165: Warning: (1366, u"Incorrect string value: '\\xF0\\x9F\\x92\\xA9' for column 'search_string' at row 1")

My PyMySQL is the most recent version, 0.8.0. - user2667066
Also, it does work if I use PyMySQL on its own (i.e. not via web2py). - user2667066
I have reported this as a bug at github.com/web2py/web2py/issues/1838. I hope that's the right thing to do. - user2667066

1 Answer

See "question mark" in "Trouble with UTF-8 characters; what I see is not what I stored".

See Python notes in http://mysql.rjweb.org/doc.php/charcoll#python . In particular, what are your connection parameters (between Python and MySQL)?
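As a rough illustration of what "connection parameters" means here, the sketch below builds the keyword arguments one might pass to pymysql.connect() so that the connection itself uses utf8mb4. The host, user, password and database values are placeholders, and the helper names are hypothetical. The DAL URI at the end assumes that web2py's MySQL adapter accepts a set_encoding query parameter; check this against your web2py/pyDAL version before relying on it.

```python
def utf8mb4_connect_kwargs(host, user, password, database):
    """Build keyword arguments for pymysql.connect() that request utf8mb4.

    Hypothetical helper for illustration; all values are placeholders.
    """
    return {
        "host": host,
        "user": user,
        "password": password,
        "database": database,
        "charset": "utf8mb4",   # connection charset, not just the table/column charset
        "use_unicode": True,    # exchange unicode strings rather than raw bytes
    }


def open_connection(**kwargs):
    """Open a PyMySQL connection (third-party package, imported lazily)."""
    import pymysql
    return pymysql.connect(**kwargs)


# Placeholder credentials, not taken from the question.
params = utf8mb4_connect_kwargs("localhost", "myuser", "mypassword", "mydb")

# Assumed web2py equivalent: request the same charset through the DAL URI.
DAL_URI = "mysql://myuser:mypassword@localhost/mydb?set_encoding=utf8mb4"
```

Calling open_connection(**params) would then open the connection; in web2py the URI would be passed to DAL(...) in place of the existing connection string.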

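u"Invalid utf8 character string: 'F09F92'" implies that something between Python and MySQL is treating the string as MySQL's 3-byte utf8 rather than utf8mb4, and so is baffled by the 4-byte pile of poo (U+1F4A9, bytes F0 9F 92 A9): F09F92 is just the first three of its four bytes.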
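To see where the 'F09F92' in the warning comes from, a small sketch like the following can help. It encodes the character as UTF-8 and shows that the warning quotes only the first three of its four bytes, which is as many as MySQL's legacy utf8 charset can hold per character. The variable names are illustrative only.

```python
import binascii

poo = u"\U0001F4A9"            # PILE OF POO, the character from the question
encoded = poo.encode("utf-8")  # its UTF-8 byte sequence

hex_bytes = binascii.hexlify(encoded).decode("ascii").upper()
print(hex_bytes)               # F09F92A9

# The warning quotes only the first three bytes (six hex digits).
truncated = hex_bytes[:6]
print(truncated)               # F09F92

# MySQL's legacy "utf8" stores at most 3 bytes per character,
# so any character needing 4 bytes requires utf8mb4 end to end.
needs_utf8mb4 = len(encoded) > 3
print(needs_utf8mb4)           # True
```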