I have a large file with two different encodings. The "main" file is UTF-8, but some characters, like <80> (€ in isoxxx) or <9F> (ß in isoxxx), are in ISO-8859-1 encoding. I can use this to replace the invalid characters:
string.encode("iso8859-1", "utf-8", {:invalid => :replace, :replace => "-"}).encode("utf-8")
The problem is that I need these wrongly encoded characters, so replacing them with "-" is useless to me. How can I fix the wrongly encoded characters in the document with Ruby?
EDIT: I've tried the :fallback option, but with no success (no replacements were made):
string.encode("iso8859-1", "utf-8",
:fallback => {"\x80" => "123"}
)
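For reference, here is a minimal sketch of the kind of repair I am after. It rests on several assumptions. It uses String#scrub with a block (Ruby 2.1+). It treats the stray bytes as Windows-1252 rather than strict ISO-8859-1, since that is the code page where 0x80 is €. The helper name repair_mixed_encoding and its legacy: parameter are made up for illustration. As far as I can tell, :fallback only applies to characters that are undefined in the target encoding, not to byte sequences that are invalid in the source encoding, which may be why it made no replacements above.

```ruby
# Hypothetical helper: keeps valid UTF-8 as-is and reinterprets every
# invalid byte sequence as a character from a legacy single-byte encoding.
# Assumption: the stray bytes are Windows-1252 (0x80 => "€").
def repair_mixed_encoding(raw, legacy: Encoding::Windows_1252)
  # Work on a copy so the caller's string is not mutated.
  text = raw.dup.force_encoding(Encoding::UTF_8)

  # scrub yields each invalid byte sequence to the block; the block's
  # return value is inserted in its place.
  text.scrub do |bad_bytes|
    # encode(dst, src) reads the bytes as `src` and transcodes to `dst`.
    bad_bytes.encode(Encoding::UTF_8, legacy)
  end
end

fixed = repair_mixed_encoding("caf\xC3\xA9 \x80 5".b)
puts fixed
```

Two caveats with this sketch: bytes that Windows-1252 leaves unassigned (0x81, 0x8D, 0x8F, 0x90, 0x9D) would raise Encoding::UndefinedConversionError, and a run of legacy bytes that happens to form valid UTF-8 would be left untouched.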