1
votes

I'm trying to insert web addresses that contain Scandinavian letters into my database, for example:

ÄÖäöÅå

I'm using:

  • openSUSE 13.2 (64-bit) Linux with MariaDB
  • MySQL server version: 5.5.44-MariaDB (openSUSE package)
  • PHP version: 5.4.20

When I try to insert, I get this error message:

Incorrect string value: '\xC4HK\xD6.

This query sets the connection character set and collation, and it succeeds:

if (mysql_query("SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci")) {
    echo "Character set OK !";
}

My MySQL query works for everything except URLs that contain Scandinavian letters:

if (mysql_query("INSERT INTO `table` (`address`) VALUES ('$URL')")) {
    $insertCount++;
    echo "<br> insertcount = ".$insertCount."<br>";
} else {
    echo "MySQLerror = ".mysql_error()."<br>"; // Show the MySQL error
}

This is MySQL info from MariaDB, showing that everything is set to utf8mb4:

MariaDB [(none)]> SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';

+--------------------------+--------------------+
| Variable_name            | Value              |
+--------------------------+--------------------+
| character_set_client     | utf8mb4            |
| character_set_connection | utf8mb4            |
| character_set_database   | utf8mb4            |
| character_set_filesystem | binary             |
| character_set_results    | utf8mb4            |
| character_set_server     | utf8mb4            |
| character_set_system     | utf8               |
| collation_connection     | utf8mb4_unicode_ci |
| collation_database       | utf8mb4_unicode_ci |
| collation_server         | utf8mb4_unicode_ci |
+--------------------------+--------------------+
10 rows in set (0,00 sec)

How can I correctly insert Scandinavian letters?


Edit

@Monty: These are my database settings:

MariaDB [(none)]> show variables like '%colla%';
+----------------------+--------------------+
| Variable_name        | Value              |
+----------------------+--------------------+
| collation_connection | utf8mb4_unicode_ci |
| collation_database   | utf8mb4_unicode_ci |
| collation_server     | utf8mb4_unicode_ci |
+----------------------+--------------------+
3 rows in set (0,00 sec)

MariaDB [(none)]> show variables like '%charac%';
+--------------------------+------------------------------+
| Variable_name            | Value                        |
+--------------------------+------------------------------+
| character_set_client     | utf8mb4                      |
| character_set_connection | utf8mb4                      |
| character_set_database   | utf8mb4                      |
| character_set_filesystem | binary                       |
| character_set_results    | utf8mb4                      |
| character_set_server     | utf8mb4                      |
| character_set_system     | utf8                         |
| character_sets_dir       | /usr/share/mariadb/charsets/ |
+--------------------------+------------------------------+
8 rows in set (0,00 sec)

MariaDB [(none)]> 

Edit

@Rick James: This is what I got back:

MariaDB [db]> SHOW CREATE TABLE table;
+-------+--------------+
| Table | Create Table |
+-------+--------------+
| table | CREATE TABLE table (
  addr varchar(150) COLLATE utf8mb4_unicode_ci NOT NULL,
  PRIMARY KEY (addr),
  UNIQUE KEY addr (addr)
) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci COMMENT='List' |
+-------+--------------+
1 row in set (0,00 sec)

MariaDB [db]>

2
If you can, you should stop using mysql_* functions. They are officially deprecated, and the extension has been removed in PHP 7. Learn about prepared statements instead, and consider using PDO; it's really not hard. - Jay Blanchard
@Jay Blanchard I haven't got as far as preventing SQL injection yet, but I will do that as soon as I get this solved. Thanks, mate. - Sparky
Have you verified that your PHP file is saved as UTF-8? - JoSSte

2 Answers

0
votes

Try this:

Verify that the tables where the data is stored have the utf8 character set:

SELECT
  `tables`.`TABLE_NAME`,
  `collations`.`character_set_name`
FROM
  `information_schema`.`TABLES` AS `tables`,
  `information_schema`.`COLLATION_CHARACTER_SET_APPLICABILITY` AS `collations`
WHERE
  `tables`.`table_schema` = DATABASE()
  AND `collations`.`collation_name` = `tables`.`table_collation`
;

Check your database settings:

show variables like '%colla%';
show variables like '%charac%';

Convert the table to the utf8 character set with the utf8_unicode_ci collation:

ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;

0
votes

C4 and D6 are latin1 hex for Ä and Ö.
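
A quick way to check this from PHP is sketched below. It is a minimal example that assumes the mbstring extension is installed. The helper name to_utf8 is hypothetical (it does not appear in the question's code), and it assumes that any input that is not already valid UTF-8 is latin1 (ISO-8859-1). Only the bytes visible in the error message are used.

```php
<?php
// Hypothetical helper, not from the original post: return a UTF-8 copy of $value.
// Assumption: anything that is not already valid UTF-8 is latin1 (ISO-8859-1).
function to_utf8($value)
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value; // already valid UTF-8, leave it untouched
    }
    return mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
}

$latin1 = "\xC4HK\xD6";      // the bytes shown in the error message
$utf8   = to_utf8($latin1);

var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false): not valid UTF-8
echo bin2hex($latin1), "\n";                   // c4484bd6
echo bin2hex($utf8), "\n";                     // c384484bc396
```

If $URL holds bytes like the first string, the connection announces utf8mb4 through SET NAMES while the data actually sent is latin1, which would be consistent with the "Incorrect string value" error. Converting the value before the INSERT, as in the sketch, is one possible workaround; fixing the encoding at the source of the URLs is another.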

Please do SHOW CREATE TABLE to see what CHARACTER SET is set for the column in question. I suspect it is incorrectly latin1.

And, yes, you must switch away from mysql_* interface.