Oracle Database - Character Set

Question

How can I keep the special characters in my package, when I compile using sqlplus or any other tool?

NLS_CHARACTERSET : WE8MSWIN1252
The client character´s set: unknown
The problem is: Data Presentation
OS and DB : UNIX \ Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production

Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
PL/SQL Release 12.1.0.2.0 - Production
CORE 12.1.0.2.0 Production
TNS for Linux: Version 12.1.0.2.0 - Production
NLSRTL Version 12.1.0.2.0 - Production

Example:

CREATE OR REPLACE PACKAGE BODY pkg1
AS
  procedure sp 
  IS
  v_var VARCHAR2(100);
  BEGIN
        v_var := '**último**';
  END;

   END;

When I do describe the package look like this:

CREATE OR REPLACE PACKAGE BODY HR.pkg1
AS
  procedure sp
  IS
  v_var VARCHAR2(100);
  BEGIN
        v_var := '**£ltimo**';     
  END;


   END;

So is the problem really that “The client character´s set: unknown”? Do you not know the NLS_LANG setting, or the operating system character set, or both? Is the string correct in the data dictionary (dumping it would tell you) and only displayed incorrectly - you seem to suggest that but how have you established it? — Alex Poole

collapsar collapsar · Accepted Answer · 2018-02-14T10:23:56

There is generic way to enter arbitrary unicode data into oracle databases limiting oneself to the ascii charset:

Determine the utf-8 representation of the target text as a sequence of bytes in hex.

Use an utf8-savvy editor and enter the text you wish to transfer to the db. Many editors come with a hex mode that allows you to inspect the utf8 representationand write it down someplace else. Alternatively use a dedicated hex editor or use cli tools like od (linux) to produce file content as a tabulated sequence of octets represented in hex:
```
 od --output-duplicates --format x1 --width=16 demo.txt
```
The output needs some postprocessing to get rid of the line numbers and blanks, however, that should be straightforward. For bulk data use sed or a similar tool.
Convert the hex string to oracle's raw datatype.

Calling HEXTORAW(x) in PlSql does the trick.
Convert the raw data to varchar2, ie. the databases character set.

Use the plsql package procedure utl_i18n.raw_to_char. Its second argument specifies the source encoding.

In summary, the method boils down to a query similar to (test string are the lowercase letters a through f followed by the german umlauts ) :

 select utl_i18n.raw_to_char ( hextoraw ( '616263646566c3a4c3b6c3bcc384c396c39cc39f' ), 'AL32UTF8' ) from dual;

In your package you would write:

 v_var := '**' || utl_i18n.raw_to_char ( hextoraw ( 'c3ba' ), 'AL32UTF8' )
               || 'ltimo**';
   -- c3ba being the utf8 encoding of unicode codepoint U+00fa,
   -- 'LATIN SMALL LETTER U WITH ACUTE'

Obviously the technique only works if the database character set is capable of representing the characters in the first place.

Shortcuts

You may use any other source encoding the special characters you need are representable in as long as the db is aware of that encoding. Namely the iso 8859-x series of encodings for latin alphabets may be used. In your case for ú and iso8859-1:
```
 v_var := utl_i18n.raw_to_char ( hextoraw ( 'fa' ), 'WE8ISO8859P1' ) from dual;
```
Using the National Character Set in your db, you may enter unicode characters by their code points in your literal strings. Just wrap the literal strings with the UNISTR function:
```
v_var := TO_CHAR(UNISTR('**\00faltimo**')) from dual;
```

Oracle Database - Character Set

1 Answers