I am using an Oracle database with ISO-8859-1 data. When I read a String from this database through a ResultSet and print it to the console, the output has the wrong encoding.

Locale.getDefault(); // -> fr_FR
Charset.defaultCharset(); // -> UTF-8

I tried printing this data from my ResultSet in two ways:

rs.getString("MY_COL"); // direct from ResultSet
new String(rs.getString("MY_COL").getBytes(Charset.forName("ISO-8859-15")), Charset.forName("UTF-8")); // convert ISO bytes to UTF-8 bytes

This outputs:

générale
générale

So why does the Oracle JDBC driver create a String with ISO-8859-1 byte encoding? How can I get a String with UTF-8 byte encoding without altering the database or converting the String? Can I change this through the driver configuration or JVM args?

It is also possible that the string is stored incorrectly inside the database. - Mark Rotteveel
How can I verify that? - Aure77
Check the value in SQL*Plus, Toad or another tool? - Mark Rotteveel
Execute the following query to get the bytes stored in the database: select dump(my_col) from my_table - SubOptimal
See stackoverflow.com/a/18080993/836215 to check the database's internal encoding. - ibre5041

1 Answer

I guess your database is not in ISO 8859-1 (NLS_CHARACTERSET = WE8ISO8859P1).

On the database, run:

create table foo (col1 varchar2(40));
insert into foo values('é');
insert into foo values(chr(233));
select dump(col1) from foo;

should return

Typ=1 Len=1: 233 
Typ=1 Len=1: 233 

If you get for example

Typ=1 Len=2: 195,169
Typ=1 Len=1: 233

then your database is set up for UTF8 (NLS_CHARACTERSET = AL32UTF8).
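
To see why those two dumps point at different character sets, here is a minimal Java sketch (not part of the original answer). It prints the bytes that 'é' encodes to in ISO-8859-1 and in UTF-8, then shows what happens when UTF-8 bytes are decoded as ISO-8859-1. The class name EncodingDemo and its helper methods are hypothetical, chosen only for illustration, and the sketch does not touch the database or the JDBC driver.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingDemo {

    // Encodes s with the given charset and returns the bytes as unsigned
    // values (0..255), the same form that Oracle's DUMP prints.
    static int[] unsignedBytes(String s, Charset cs) {
        byte[] raw = s.getBytes(cs);
        int[] out = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = raw[i] & 0xFF;
        }
        return out;
    }

    // Simulates a charset mismatch: the text is encoded as UTF-8,
    // but the resulting bytes are decoded as ISO-8859-1.
    static String misreadUtf8AsLatin1(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // 'é' written as a Unicode escape so the source file encoding does not matter.
        String e = "\u00e9";

        // One byte in ISO-8859-1, two bytes in UTF-8.
        System.out.println(Arrays.toString(unsignedBytes(e, StandardCharsets.ISO_8859_1)));
        System.out.println(Arrays.toString(unsignedBytes(e, StandardCharsets.UTF_8)));

        // Each 'é' becomes two characters (U+00C3, U+00A9) when misread.
        System.out.println(misreadUtf8AsLatin1("g\u00e9n\u00e9rale"));
    }
}
```

The one-byte result (233) matches the WE8ISO8859P1 case above and the two-byte result (195, 169) matches the AL32UTF8 case. A Java String itself has no ISO-8859-1 or UTF-8 encoding; a charset only comes into play when the text is converted to or from bytes, so garbled output usually means some byte-to-text step used the wrong charset.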