I have a CCK type for storing mentions (Social Media search mentions). Some of the mentions I believe are ASCII (My knowledge of this stuff is little).
I retrieve data from API's, which I then using node_save to save to Drupal.
My question is, what should I use to safely convert whatever I am getting into a format Drupal and MySQL are happy with?
The particular db_query error I get is unhelpfull "Warning in test1\includes\common.inc on line 3538". Nice. I have traced it to be encoding, as I used the following code to make the input safe, but it is not working with all input.
$node->title = htmlentities($item['title'], ENT_COMPAT, 'UTF-8');
It worked well for some ASCII characters, like those square ones [] etc, but not for this "行けなくてもずっとユーミンは聴きつづけます".
I'm really stuck. :(
UPDATE: The EXACT error I get from PHP is "Warning in D:\sites\test1\includes\common.inc on line 3538", and the line reads "if (db_query($query, $values)) {".
UPDATE 2: I've confirmed that the encoding of the data I am receiving is UTF8. This really doesn't make sense now, and I've confirmed that the collation in the db is utf8_general_ci.
UPDATE 3: One of the title's is: How Much Does A Facebook Fan Cost?� $1.07
The output of:
var_export(array_map('ord', str_split($node->title))
gave me the character 160 for the funny question mark (which is a square like [] in eclipse).
UPDATE 4: MySQL version is 5.1.41, and the collation on the columns is utf8_general_ci.
UPDATE 5: I managed to get Drupal to print the query with db_queryd. Funny thing is now I get the exact error message and not "Warning in", but Drupal still doesn't have this error in the log! WTF. So the exact sql is:
INSERT INTO node (vid, type, language, title, uid, status, created, changed, comment, promote, moderate, sticky, tnid, translate) VALUES (0, 'sm_mention', '', 'How Much Does A Facebook Fan Cost?� $1.07 (Geoffrey A. Fowler/Digits)', 1, 1, 1298395302, 1298395302, 0, 0, 0, 0, 0, 0)
And the error given is: Incorrect string value: '\xA0 $1.0...' for column 'title' at row 1
This honestly sounds like something doesn't like extended ascii characters.
UPDATE 6:
SHOW CREATE TABLE node:
CREATE TABLE `node` (
`nid` int(10) unsigned NOT NULL AUTO_INCREMENT,
`vid` int(10) unsigned NOT NULL DEFAULT '0',
`type` varchar(32) NOT NULL DEFAULT '',
`language` varchar(12) NOT NULL DEFAULT '',
`title` varchar(255) NOT NULL DEFAULT '',
`uid` int(11) NOT NULL DEFAULT '0',
`status` int(11) NOT NULL DEFAULT '1',
`created` int(11) NOT NULL DEFAULT '0',
`changed` int(11) NOT NULL DEFAULT '0',
`comment` int(11) NOT NULL DEFAULT '0',
`promote` int(11) NOT NULL DEFAULT '0',
`moderate` int(11) NOT NULL DEFAULT '0',
`sticky` int(11) NOT NULL DEFAULT '0',
`tnid` int(10) unsigned NOT NULL DEFAULT '0',
`translate` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`nid`),
UNIQUE KEY `vid` (`vid`),
KEY `node_changed` (`changed`),
KEY `node_created` (`created`),
KEY `node_moderate` (`moderate`),
KEY `node_promote_status` (`promote`,`status`),
KEY `node_status_type` (`status`,`type`,`nid`),
KEY `node_title_type` (`title`,`type`(4)),
KEY `node_type` (`type`(4)),
KEY `uid` (`uid`),
KEY `tnid` (`tnid`),
KEY `translate` (`translate`)
) ENGINE=InnoDB AUTO_INCREMENT=1700 DEFAULT CHARSET=utf8
file_put_contents('/tmp/log', var_export(array_map('ord', str_split($node->title)), TRUE));
or something like that. copy-paste is very unlikely to work... having the characters themselves would help. Then we can begin to look for the problem. Also give us MySQL version and check your tables and columns to have some utf8 collation. – chx