
Using open() from Ruby's open-uri, I want to fetch files from arbitrary servers that are not under my control. A server may specify a charset in the Content-Type header, e.g. text/calendar; charset=utf-8 or text/calendar; charset=ISO-8859-1, and in that case I am happy for open() to trust whatever charset the server claims. However, if the server does not specify a charset, open() seems to fall back to ASCII-8BIT. I want open() to assume UTF-8 instead when no charset is specified, since text/calendar (i.e. iCal) files should normally be encoded as UTF-8.

I want to stress that the default should apply only when no charset is specified, because I still want to respect a server's choice to serve files in whatever charset it pleases.

I tried open('http://my-test-uri.test', 'r:UTF-8'), but that unconditionally overrides the charset, even if the server specifies a different one such as ISO-8859-1.


1 Answer


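OpenURI::Meta#charset accepts a block. If the server specified a charset, that charset is returned and the block is ignored; the block is evaluated, and its value returned, only if the server did not specify one.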

Using that, we can set the encoding of the IO object returned by open (a StringIO, or a Tempfile for larger responses) to either the charset the server declared, which is a redundant no-op, or to our default:

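require 'open-uri'

# On Ruby 3.0+, Kernel#open no longer fetches URLs; use URI.open instead.
URI.open('http://localhost:3333').tap do |io|
  # The block's value is used only when the server sent no charset.
  charset = io.charset { 'utf-8' }
  io.set_encoding(charset)
end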
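
To check both cases without depending on a real server, here is a rough sketch that wraps the snippet above in a helper and points it at a throwaway local server. The names open_with_default_charset and serve_once are made up for this illustration (they are not part of open-uri), and the sketch assumes a Ruby version where URI.open is available and that binding to 127.0.0.1 is allowed.

```ruby
require 'open-uri'
require 'socket'

# Hypothetical helper (not part of open-uri): open a URL and apply
# `default` as the encoding only when the server declared no charset.
def open_with_default_charset(url, default = 'utf-8')
  URI.open(url).tap do |io|
    # The block is evaluated only if Content-Type carries no charset.
    io.set_encoding(io.charset { default })
  end
end

# Hypothetical test fixture: a one-shot HTTP server on a random local port
# that answers a single request with the given Content-Type and body.
def serve_once(content_type, body)
  server = TCPServer.new('127.0.0.1', 0)
  port = server.addr[1]
  Thread.new do
    client = server.accept
    # Read and discard the request line and headers up to the blank line.
    while (line = client.gets) && line != "\r\n"
    end
    client.write("HTTP/1.1 200 OK\r\n" \
                 "Content-Type: #{content_type}\r\n" \
                 "Content-Length: #{body.bytesize}\r\n" \
                 "Connection: close\r\n\r\n")
    client.write(body)
    client.close
    server.close
  end
  "http://127.0.0.1:#{port}/"
end

body = "BEGIN:VCALENDAR\r\nEND:VCALENDAR\r\n"

# Server sends no charset: the default should apply.
no_charset = open_with_default_charset(serve_once('text/calendar', body))
puts no_charset.read.encoding

# Server sends a charset: it should be respected.
latin1 = open_with_default_charset(
  serve_once('text/calendar; charset=ISO-8859-1', body)
)
puts latin1.read.encoding
```

If this behaves as described in the answer, the first read should report UTF-8 and the second ISO-8859-1. Passing the default as a parameter keeps the helper usable for content types other than text/calendar.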