11
votes

Rails 2.3.5 and Ruby 1.8.7 and Mysql 5.1.53

Im loading in a csv file, it has a field that TM symbol in it (trademark)

Tart Deco™ - looks like this

I am trying to do an active record find:

Influencer.find(:first,:conditions => ["author_name = ? and url_discovered = ?",author_name,site_profile_url])

Mysql::Error: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '=': SELECT * FROM influencers WHERE (author_name = 'Tart Deco?' and url_discovered = 'http://www.joelnylund.com') LIMIT 1

In ruby debugger the String shows up as:

p author_name "Tart Deco\231"

My table is encoded "utf8_general_ci"

So what should I do? I dont really care to much if I store the TM, it would be nice, mainly I just dont want it to break...

4
yucky workaround for now, remove non ascii chars - stackoverflow.com/questions/1268289/…Joelio

4 Answers

16
votes

Make sure your table sports the uft8 character set:

alter table `influencers` convert to character set utf8 collate utf8_general_ci;

Note: your collation (utf8_general_ci) is not your encoding (character set) - a common misunderstanding with MySQL.

1
votes

Perhaps not the most helpful answer, but I just threw together a sample test in Rails 3 w/ Ruby 1.9.2 and it worked fine. I understand character encoding was majorly overhauled in Ruby 1.9.

Note: In my test I just copied your text and in rails console

:001 > author_name = 'Tart Deco™'
=> "Tart Deco™"
:002 > Influencer.find(:first,:conditions => ["author_name = ?", author_name])
=> nil 

Of course, I just fudged an Influencer model w/ no records. But MySQL didn't barf. So, I added a record to my model w/ that name and it retrieved fine.

HTH

1
votes

your problem is somewhat related to ruby 1.8 and its characters encoding management. try to force conversion of text read from CSV, from ASCII to UTF8, using Iconv.

simple example:

 csv_text = SOME_CSV_READ_CODE # get the string or text
 # instantiate a Iconv converter from generic ASCII to UTF8, ignoring errors
 converter = Iconv.new('UTF-8//IGNORE', 'ASCII//IGNORE')
 # clean string from unrecognized chars 
 utf8_text = converter.iconv(csv_text).unpack('U*').select{ |cp| cp < 127 }.pack('U*')

in your case, you should convert (at least) author_name parameter before using it in your query. be sure to do the same when saving text/strings on db, or you'll get similar errors during inserts/updates.

0
votes

I wanted to ask a question about how you are loading the data in question but apparently the 'edit queue is full'. Are you using rails to load up the file and push it into the database or are you using one of the MySQL options like 'load data... infile...'?