0
votes

I have a text file that has some ® (Registered Trade Mark) symbols in it. The file is in UTF-8.

I'm trying to import this file and populate a MySQL database using Rails 3. The DB appears to be set up correctly for UTF-8:

+-------------+--------------+-----------------+------+-----+---------+----------------+---------------------------------+---------+
| Field       | Type         | Collation       | Null | Key | Default | Extra          | Privileges                      | Comment |
+-------------+--------------+-----------------+------+-----+---------+----------------+---------------------------------+---------+
| id          | int(11)      | NULL            | NO   | PRI | NULL    | auto_increment | select,insert,update,references |         |
| user_id     | int(11)      | NULL            | YES  | MUL | NULL    |                | select,insert,update,references |         |
| title       | varchar(255) | utf8_general_ci | YES  |     | NULL    |                | select,insert,update,references |         |
| translation | text         | utf8_general_ci | YES  |     | NULL    |                | select,insert,update,references |         |
| created_at  | datetime     | NULL            | NO   |     | NULL    |                | select,insert,update,references |         |
| updated_at  | datetime     | NULL            | NO   |     | NULL    |                | select,insert,update,references |         |
+-------------+--------------+-----------------+------+-----+---------+----------------+---------------------------------+---------+

Yet, when I try to do this:

trans_file = params[:descriptions] #coming from file_field_tag
trans = trans_file.read.split("\r\n")
trans.each do |tran|
  ttl = ''
  desc = ''
  tran.split(']=').each do |title|
    if title =~ /\[/ #it's the title
      ttl = title.sub('[','')
    else
      desc = title.gsub('FFF', "\r\n")
    end
  end
  @current_user.cd_translations.build(title: ttl, translation: desc).save
end

I'm getting the error "Action Controller: incompatible character encodings: ASCII-8BIT and UTF-8".

I'm using the utf-8 encoding in my application.rb file, and I'm using the mysql2 gem.

If I remove the Registered Trade Mark character, the error goes away. But it's not really an option to strip it out of the incoming text.

I tried the solution here: https://stackoverflow.com/a/5215676/102372, but that didn't make any difference.

Stack trace:

app/controllers/users_controller.rb:28:in `block in update_cd_translations'
app/controllers/users_controller.rb:15:in `each'
app/controllers/users_controller.rb:15:in `update_cd_translations'
config/initializers/quiet_assets.rb:7:in `call_with_quiet_assets'

How can I resolve this?

2
Can you show how you're opening / reading from the file? Have you set any of the default encodings? What does the stack trace look like? - Frederick Cheung
What is the encoding of that controller file? You can see what Ruby thinks it is by checking the encoding of any of the string literals in it. - Frederick Cheung
It's US-ASCII, but trying Lictamberg's answer below doesn't change the exception I'm getting. - croceldon
What encoding does Ruby think the strings coming from your uploaded file are? - Frederick Cheung
Ah, it's showing ASCII-8BIT. Is there a way to convert? - croceldon

2 Answers

1
votes

It appears that Ruby thinks the contents of the uploaded file are ASCII-8BIT (that is to say, binary).

If you know the encoding of the file, you can use force_encoding to change the encoding of the string (without transcoding). If you can't always be sure of the file's encoding, the charguess gem can be used to guess it.
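
A minimal sketch of the force_encoding approach, assuming the upload really is UTF-8. The helper name as_utf8 and the sample string are made up for illustration and are not part of Rails:

```ruby
# Hypothetical helper (not part of Rails): relabel raw upload bytes as UTF-8.
# force_encoding only changes the encoding tag; the bytes are left untouched.
def as_utf8(raw)
  text = raw.dup.force_encoding(Encoding::UTF_8)
  # Fail loudly if the bytes are not actually valid UTF-8.
  raise ArgumentError, "upload is not valid UTF-8" unless text.valid_encoding?
  text
end

# Simulate what reading the upload returned in the question:
# UTF-8 bytes tagged as binary (ASCII-8BIT).
raw = "[Acme\u00AE]=Widget".dup.force_encoding(Encoding::BINARY)

text = as_utf8(raw)
title, desc = text.split("]=")
puts text.encoding        # UTF-8
puts title.sub("[", "")   # Acme®
```

In the controller from the question, this would probably mean wrapping the read, along the lines of as_utf8(trans_file.read).split("\r\n"), so that every piece split off afterwards is already tagged as UTF-8 before it reaches the model.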

0
votes

Try adding

# -*- encoding : utf-8 -*-

at the beginning of each source file involved in the process.
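
A small sketch of what the magic comment affects, with a made-up literal for illustration. It sets the encoding of string literals written in that source file (Ruby 1.9 defaults to US-ASCII, Ruby 2.0 and later default to UTF-8); it does not change the encoding of data read at runtime, such as an uploaded file:

```ruby
# -*- encoding : utf-8 -*-
# The magic comment must be the first line of the file (or the second,
# after a shebang line) to take effect.
literal = "Acme®"

puts __ENCODING__       # source encoding Ruby is using for this file
puts literal.encoding   # literals inherit the source encoding
```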