0
votes

We have some data of type lob and varchar2 which is wrongly encoded to AL32UTF8and we need to convert its encoding to we8mswin1252 character set. I got convert function which is used for character set conversion and this function is working fine with varchar2 datatype but its giving some fuzzy characters when used with CLOB. My current encoding is AL32UTF8.

 select value,CONVERT(value,'we8mswin1252','AL32UTF8'),CONVERT(to_clob(value),'we8mswin1252','AL32UTF8') from temp;

enter image description here

Database: Oracle 12c

select * from nls_database_parameters where parameter like  '%CHARACTERSET%';

enter image description here

1
What are your database and national character sets? The documentation mentions limitations on the convert function.Alex Poole
@alex Updated the question with required details. I am using Oracle 12c and nls character set is NLS_NCHAR_CHARACTERSET => AL16UTF16 NLS_CHARACTERSET => AL32UTF8Bhanu Yadav
Then as the docs say, "Any dest_char_set that is not one of these two character sets is unsupported." I don't think any of the other approaches support CLOBs directly. You may have to cut your CLOB into varchar2 chunks, convert those, and then stick them back together?Alex Poole
As you can see in the value column all the characters are plain english character except the "-" which is a special character through this character is available in both the encoding, what I dont understand is why its converting all characters to fuzzy characters instead of replacing only that special character.Bhanu Yadav
It's still trying to convert every character from one character set to the other. I have no idea why it works (for now) with varchar2 but not with CLOB with that particularly character set combination; but as what you are doing is unsupported I'm not sure it matters. Also, from MoS doc 1628060.1 and elsewhere, "It is technically possible , but unsupported, to store data / languages / characters who are not defined in the NLS_CHARACTERSET in CHAR, VARCHAR2, LONG or CLOB datatypes." Your starting point, that your data is wrongly encoded, doesn't sound correct. Also see doc 158577.1.Alex Poole

1 Answers

0
votes

Answer I got from Oracle: "The CONVERT function is a legacy function that should not be used for any application-related character set conversion. CLOB is especially tricky as it uses a special storage encoding in multibyte databases, such as AL32UTF8. CONVERT is useful for some repair tasks only. The supported way of processing data in character sets other than the database character set is to use the RAW data type and the conversion procedures from the package UTL_I18N. However, UTL_I18N does not support CLOBs. Therefore, you do have to read CLOBs in 32K chunks, convert them to WE8MSWIN1252 with UTL_I18N.STRING_TO_RAW, do your post-processing and write the file with UTL_FILE.PUT_RAW."

Also I got another unix command iconv which converts file encoding so basically I was writing db output to file so converted its encoding and transfered the data.

Thanks