0
votes

I just migrated from Ruby 1.8.7 to Ruby 1.9.2 and I keep stumbling on string encoding headaches. My MySQL database is all utf8, and yet every time I query it and look up the string encoding, I get ASCII-8BIT.

ruby-1.9.2-p180 :002 > Artist.find(1043).name.encoding
 => #<Encoding:ASCII-8BIT> 

So an artist like Sigur Rós prints out as "Sigur R\xC3\xB3s". This causes problems because my client app expects my server to return UTF-8 JSON, as it always has. A temporary workaround seems to be adding force_encoding("UTF-8") all over my code, but that feels extremely messy, especially since Ruby 1.8.7 didn't need any of this.
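To make the symptom and the workaround concrete, here is a minimal sketch that simulates the situation without a database. It assumes the driver returns the correct UTF-8 bytes but labels them as binary; `tag_as_utf8` is a hypothetical helper name, not something from Rails or mysql2.

```ruby
# Simulate what the driver seems to hand back: UTF-8 bytes tagged as binary.
raw = "Sigur R\xC3\xB3s".dup.force_encoding(Encoding::ASCII_8BIT)

# Hypothetical helper: relabel the bytes as UTF-8 without changing them.
# force_encoding only swaps the encoding tag; it does not transcode.
def tag_as_utf8(str)
  str.dup.force_encoding(Encoding::UTF_8)
end

tagged = tag_as_utf8(raw)

puts raw.encoding            # the binary encoding (ASCII-8BIT)
puts tagged.encoding         # UTF-8
puts tagged.valid_encoding?  # true, the bytes are well-formed UTF-8
puts raw.length              # 10, counted as bytes
puts tagged.length           # 9, counted as characters
```

Because the bytes are identical in both strings, this only papers over a mislabeled encoding, which is why sprinkling it through application code feels like the wrong layer for the fix.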

I tried adding encoding magic comments, setting Encoding.default_external in my env file, and adding the encoding parameter to my database.yml file; none of it works.
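For reference, the three settings mentioned above usually look roughly like the sketch below. The database.yml content is inlined as a string so the snippet runs on its own; the `development` section and the `myapp_development` database name are assumptions for illustration, not taken from my actual config.

```ruby
# encoding: utf-8
# The magic comment above must be the first line of the file (or the second,
# after a shebang). It sets the encoding of string literals in this file only.
require "yaml"

# Default encoding assumed for data read from outside (files, IO).
Encoding.default_external = Encoding::UTF_8

# Sketch of the relevant part of config/database.yml (names are hypothetical).
DATABASE_YML = <<~YAML
  development:
    adapter: mysql2
    encoding: utf8
    database: myapp_development
YAML

config = YAML.safe_load(DATABASE_YML)
puts config["development"]["encoding"]
puts Encoding.default_external
```

None of these settings changes how a driver tags the strings it builds itself, which may be why they had no visible effect here.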

How am I supposed to deal with Ruby 1.9.x and string encodings?

--

EDIT: In my other Rails app (which has been on Ruby 1.9.2 since the very start), all the MySQL strings seem to be encoded in UTF-8. But the database and table encoding/charset settings are exactly the same?!

1
How do you connect to your MySQL database? Do you set a connection string? Did you specify the charset? CharSet=UTF8; - phoet
@phoet Like I said, I do have the encoding param in my database.yml file. All my tables have the encoding/charset set to utf8. - samvermette
Like you said. But I am not talking about your database.yml file. I am talking about the connection string that you set when talking to your database via the database driver you use, which is either the mysql or the mysql2 gem. The two behave differently. - phoet
@phoet Whoops, sorry. I use the mysql2 gem. I don't recall any "connection string", I just use ActiveRecord directly... - samvermette

1 Answer

0
votes

Well, updating to Rails 3.1.3 and mysql2 0.3.10 seems to have solved my issue (I was running Rails 3.0.3 and mysql2 0.2.6). It seems weird to me though: Ruby 1.9 is over 3 years old and Rails 3.0.3 was released well after that, so I don't see why Rails 3.0.x wouldn't play nice with Ruby 1.9's new string encodings. If anyone can elaborate on this, I would be grateful.
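If you want to check whether an app is on at least the versions that worked for me, a small sketch using RubyGems' version comparison follows. Treating those versions as a minimum is my assumption (I did not bisect which release changed the behavior), and `at_least_known_good?` is a hypothetical helper name.

```ruby
require "rubygems"

# Versions reported as working above; using them as a floor is an assumption.
KNOWN_GOOD = {
  "rails"  => Gem::Version.new("3.1.3"),
  "mysql2" => Gem::Version.new("0.3.10")
}

# Hypothetical helper: true when `version` is at least the known-good one.
def at_least_known_good?(name, version)
  Gem::Version.new(version) >= KNOWN_GOOD.fetch(name)
end

puts at_least_known_good?("mysql2", "0.2.6")   # false
puts at_least_known_good?("mysql2", "0.3.10")  # true
puts at_least_known_good?("rails", "3.0.3")    # false
```

Gem::Version compares segment by segment as numbers, so 0.3.10 correctly sorts above 0.2.6 where a plain string comparison could mislead.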