I'd like to write a clojure function that takes a string in one encoding and converts it to another. The iconv library does this.
For example, let's look at the character "è". In ISO-8859-1 (http://www.ascii-code.com/), that's e8
as hex. In UTF-8 (http://www.ltg.ed.ac.uk/~richard/utf-8.cgi?input=%C3%A8&mode=char), it's c3 a8
.
So let's say we have iso.txt, which contains our letter and EOL:
$ hexdump iso.txt
0000000 e8 0a
0000002
Now we can convert it to UTF-8 like this:
$ iconv -f ISO-8859-1 -t UTF-8 iso.txt | hexdump
0000000 c3 a8 0a
0000003
How should I write something equivalent in clojure? I'm happy to use any external libraries, but I don't know where I'd go to find them. Looking around I couldn't figure out how to use libiconv itself on the JVM, but there's probably an alternative?
Edit
After reading Alex's link in the comment, this is so simple and so cool:
user> (new String (byte-array 2 (map unchecked-byte [0xc3 0xa8])) "UTF-8")
"è"
user> (new String (byte-array 1 [(unchecked-byte 0xe8)]) "ISO-8859-1")
"è"
e8
to the string which is the unicode character 'è'? – spike