How to get the MySQL Command-Line Tool to display Unicode properly?

Question

I use a Python program to write text containing Unicode characters to a MySQL database. As an example, two of the characters are

u'\u2640' a symbol for Venus or female
u'\u2642' a symbol for Mars or male

I use utf8mb4 for virtually all character sets involved with MySQL. Here is an excerpt from /etc/mysql/my.cnf

[client]
default-character-set=utf8mb4

[mysql]
default-character-set=utf8mb4

[mysqld]
default-character-set=utf8mb4
character-set-server =utf8mb4
character_set_system =utf8mb4

In addition, all tables are created with these parameters:

ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

In all respects except one, the treatment of Unicode works just fine. I can write Unicode to database tables, read it, display it, etc., with no problems. The exception is mysql, the MySQL Command-Line Tool. When I execute a SELECT statement to see rows in a table containing the Venus and Mars Unicode characters, here is what I see on the screen:

| Venus     | â™€      |
| Mars      | â™‚      |

What I should see in the right column are the standard glyphs for Venus and Mars.

Any ideas about how to get the MySQL Command-Line Tool to display Unicode properly?

Edit:

I have done a fair amount of research into the various MySQL system variables, etc., and I now realize that the my.cnf settings shown above have some serious issues. In fact, the server, mysqld, would not launch with the settings shown. To correct things, remove these from [mysqld]:

default-character-set=utf8mb4
character-set-system=utf8mb4

I'm not sure that the [client] option does anything, but it doesn't seem to hurt.

In Python, u'\u2640' represents a single Unicode character, namely "♀". Encoded as UTF-8, it becomes three bytes with the hex value E29980. I am having no problems at all encoding and decoding Unicode. The correct values are being stored in a MySQL table, they are correctly read from the table, and when displayed by a Python program they show up like this:

♀   Venus
♂   Mars

The program output can be redirected to a file, processed by a text editor, etc., and in all cases the correct Unicode symbol is displayed.
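The byte values mentioned above can be checked directly in Python. This is a minimal sketch, assuming Python 3; the helper name utf8_hex is made up for illustration and is not part of the original program. It prints only the hex digits, so the output does not depend on the terminal's encoding.

```python
# Code points from the question: U+2640 (female sign) and U+2642 (male sign).
symbols = {"Venus": "\u2640", "Mars": "\u2642"}


def utf8_hex(text):
    """Return the UTF-8 bytes of text as uppercase hex (hypothetical helper),
    in the same form that MySQL's HEX() function prints."""
    return text.encode("utf-8").hex().upper()


for name, char in symbols.items():
    # Each symbol is one code point but three bytes in UTF-8.
    print(name, len(char), utf8_hex(char))
```

If the hex printed here matches what HEX() reports for the column, the stored data is fine and the problem is limited to how the client displays it.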

There is only one place where the correct Unicode symbol is not displayed, and that is the MySQL Command-Line Tool. When I issue a SELECT statement on the table containing the Unicode symbols, I get the junk shown above. This is not a Windows-specific issue. I have exactly the same problem with the MySQL Command-Line Tool when I run it on Windows, Mac OS X, and Ubuntu.

I suggest using the HEX function to find out what bytes are actually stored in the column. SELECT symbol_name, HEX(symbol_bytes) FROM ... For the Venus unicode character, properly encoded in UTF8, we'd expect E29980. — spencer7593
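Once HEX() has returned a value, it can help to see which code points those bytes stand for. The following is a rough sketch, assuming Python 3 and assuming the hex string was copied by hand from the query result; describe_hex is a hypothetical helper, not a MySQL or Python library function.

```python
def describe_hex(hex_value):
    """Decode a HEX() result as UTF-8 and list the code points it contains
    (hypothetical helper for inspecting stored bytes)."""
    raw = bytes.fromhex(hex_value)
    text = raw.decode("utf-8")
    return ["U+%04X" % ord(ch) for ch in text]


# Three bytes, one code point: the value is stored correctly.
print(describe_hex("E29980"))

# Eight bytes, three code points: the value was encoded twice.
print(describe_hex("C3A2E284A2E282AC"))
```

A single code point U+2640 means the column holds the Venus symbol as intended. Three code points for what should be one character suggest the text was damaged before or while it was stored.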

Rick James · Accepted Answer · 2017-10-18T01:12:53

Windows cmd and utf8. If you are talking about the Windows console, then running chcp 65001 and picking a font that contains the glyphs is sufficient. See details.

Mojibake. If, on the other hand, you are complaining about "Mojibake" such as â™€ instead of ♀, then see Mojibake in here. The hex for Venus (aka Female Sign), when correctly stored in utf8, will be E29980. If you see C3A2 E284A2 E282AC, you have "double encoding", not simply Mojibake.
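The difference between the two cases can be reproduced in a few lines of Python. This is only an illustration, assuming Python 3 and assuming the misreading happens through Windows-1252 (Python's cp1252 codec), which is a guess about the connection settings rather than something the question confirms.

```python
venus = "\u2640"

# Correct storage: the UTF-8 bytes E2 99 80.
stored = venus.encode("utf-8")

# Mojibake: the same three bytes read as if each were one cp1252 character.
mojibake = stored.decode("cp1252")

# Double encoding: the mojibake text encoded to UTF-8 a second time.
double_encoded = mojibake.encode("utf-8")

# Reversing the misreading recovers the original character.
repaired = mojibake.encode("cp1252").decode("utf-8")

print(stored.hex().upper())          # correctly stored bytes
print(double_encoded.hex().upper())  # bytes after double encoding
print(repaired == venus)             # True when the round trip succeeds
```

With plain Mojibake the bytes in the table are still E29980 and only the interpretation on the way out is wrong. With double encoding the longer byte sequence is what is actually stored.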

Do not use u'\u2640' anywhere in MySQL.
