I. Oracle character encoding scheme
1. Single-byte character set
In a single-byte character set, each character occupies only one byte. The single-byte 7-bit encoding scheme can contain a maximum of 128 (2 ^ 7) characters. The single-byte 8-bit encoding scheme can contain a maximum of 256 (2 ^ 8) characters.
Examples of single-byte schemes:
- 7-bit character set: US 7-bit ASCII (US7ASCII)
- 8-bit character sets: Western European ISO 8859-1 (WE8ISO8859P1), Western European 8-bit EBCDIC Code Page 500 (WE8EBCDIC500), Western European 8-bit DEC (WE8DEC)
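As a rough illustration outside the database, Python's standard codecs can stand in for these schemes. The pairing of the codec names `ascii`, `latin-1`, and `cp500` with the Oracle character sets is an assumption made for this sketch, and `byte_lengths` is a hypothetical helper, not Oracle code.

```python
# Python codec names standing in for the Oracle character sets (assumed pairing).
SINGLE_BYTE_CODECS = {
    "US7ASCII": "ascii",
    "WE8ISO8859P1": "latin-1",
    "WE8EBCDIC500": "cp500",
}


def byte_lengths(text, codec):
    """Return the encoded length in bytes of each character in text."""
    return [len(ch.encode(codec)) for ch in text]


for oracle_name, codec in SINGLE_BYTE_CODECS.items():
    print(oracle_name, byte_lengths("Oracle", codec))

# An 8-bit set has room for characters beyond the 128 ASCII code points,
# while the 7-bit set rejects them.
print(byte_lengths("é", "latin-1"))
try:
    "é".encode("ascii")
except UnicodeEncodeError:
    print("'é' has no 7-bit ASCII encoding")
```

Every character comes out as exactly one byte in all three codecs; only the 8-bit ones can represent the accented letter.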
2. Fixed-width multi-byte character set
A fixed-width multi-byte character set offers the same kind of support as other multi-byte character sets, except that every character occupies the same fixed number of bytes. This gives each character a uniform byte length. Oracle supports only one fixed-width multi-byte character set, AL16UTF16, and it can be used only as the national character set.
Example of a fixed-width multi-byte character set: AL16UTF16, 16-bit Unicode (fixed-width double-byte Unicode)
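The uniform length can be checked with a small sketch. It assumes Python's `utf-16-be` codec as a stand-in for AL16UTF16 and sticks to characters in the Basic Multilingual Plane; `utf16_lengths` is a hypothetical helper name.

```python
def utf16_lengths(text):
    """Bytes per character when encoded as big-endian UTF-16 (no byte-order mark)."""
    return [len(ch.encode("utf-16-be")) for ch in text]


# A Latin letter, an accented letter, and a CJK ideograph all take two bytes.
print(utf16_lengths("Aé中"))
```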
3. Variable-width multi-byte character set
In a variable-width multi-byte character set, each character is represented by one or more bytes. Multi-byte character sets are commonly used to support Asian languages. Some multi-byte encoding schemes use the most significant bit of a byte to indicate whether it represents a single-byte character or is part of a sequence of bytes representing one character. Other schemes distinguish single-byte from multi-byte characters with control codes: a shift-out code sent by the device indicates that the bytes that follow are double-byte characters, until a shift-in code is encountered.
Examples of variable-width multi-byte schemes:
- Japanese Extended UNIX Code (JEUC)
- Chinese GB2312-80 (CGB2312-80)
- AL32UTF8 (UTF-8)
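Both ways of marking multi-byte characters can be observed with Python's standard codecs. This is a sketch under assumptions: `euc_jp`, `gb2312`, and `utf-8` stand in for the schemes listed above, `iso2022_jp` is used as an example of a mode-switching scheme (it uses escape sequences, a close relative of shift codes), and the helper names are hypothetical.

```python
def encoded_bytes(text, codec):
    """Return one bytes object per character of text."""
    return [ch.encode(codec) for ch in text]


def high_bit_set(chunk):
    """True if every byte in chunk has its most significant bit set."""
    return all(b & 0x80 for b in chunk)


sample = "A中"
for codec in ("euc_jp", "gb2312", "utf-8"):
    parts = encoded_bytes(sample, codec)
    # Lengths differ per character; only the multi-byte character has
    # the most significant bit set in its bytes.
    print(codec, [len(p) for p in parts], [high_bit_set(p) for p in parts])

# A scheme that switches modes with control sequences instead of the high bit:
# the double-byte text is bracketed by escape sequences.
print("日".encode("iso2022_jp"))
```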
4. Unicode Character Set
Unicode is a universal character encoding standard that covers the characters used in computing, including technical symbols and characters used in publishing. Version 3.0 of the Unicode standard contains 49,194 characters and has the capacity to define more than 1 million. The full Unicode character set can be expressed in different encoding forms. UTF-16 (a Unicode transformation format) uses 16-bit units and is treated by Oracle as a fixed-width double-byte format, although characters outside the Basic Multilingual Plane need a pair of units; UTF-8 is a variable-width multi-byte format.
Oracle provides AL32UTF8, UTF8, and UTFE as database character sets, and AL16UTF16 and UTF8 as national character sets. The advantage of the UTF-8-based character sets is that they encode ASCII characters with the same single-byte values, so UTF8 is a superset of ASCII. Migrating a database from an ASCII-based character set to Unicode is therefore easier.
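The ASCII-superset property is easy to verify outside the database. The sketch below assumes Python's `utf-8` and `utf-16-be` codecs as stand-ins for the Oracle Unicode character sets; the sample strings are arbitrary.

```python
text = "SELECT 1 FROM dual"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-be")

# Pure ASCII text is byte-for-byte identical in UTF-8, so existing data
# needs no conversion; UTF-16 doubles its size.
print(ascii_bytes == utf8_bytes)
print(len(utf8_bytes), len(utf16_bytes))

# A character outside the Basic Multilingual Plane (U+20000) takes
# four bytes in both encodings.
supplementary = "\U00020000"
print(len(supplementary.encode("utf-8")), len(supplementary.encode("utf-16-be")))
```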
II. Database Character Set and National Character Set
| Database Character Set | National Character Set |
| --- | --- |
| Defined when the database is created | Defined when the database is created |
| Cannot be changed without re-creating the database | Cannot be changed without re-creating the database |
| Stores columns of type CHAR, VARCHAR2, CLOB, LONG, and similar | Stores columns of type NCHAR, NVARCHAR2, and NCLOB |
| Used for identifiers such as table names, column names, and PL/SQL variables | An additional character set that extends Oracle's character handling: the NCHAR data types can provide a fixed-width multi-byte encoding for Asian languages, which the database character set cannot |
| Can store variable-width character sets | Can store Unicode in AL16UTF16 or UTF8 format |
III. NLS_ parameters related to character sets
3.1 NLS data dictionary views
- NLS_DATABASE_PARAMETERS: displays the current NLS parameter values of the database, including the database character set
- NLS_SESSION_PARAMETERS: displays the session's parameters as set through NLS_LANG or changed with ALTER SESSION (it does not include the client character set specified in NLS_LANG)
- NLS_INSTANCE_PARAMETERS: displays the parameters defined in the parameter file init<SID>.ora
- V$NLS_PARAMETERS: displays the current values of the NLS parameters
3.2 Modifying NLS parameters
- Modify the initialization parameter file used at instance startup
- Modify the environment variable NLS_LANG
- Use the ALTER SESSION statement
- Pass NLS parameters explicitly to certain SQL functions
NLS priority: SQL function > ALTER SESSION > environment variable or registry > parameter file > database default parameters
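The priority chain can be pictured as a lookup that stops at the first source that sets a parameter. The following is only a model of that rule, not how Oracle implements it; the source labels, the `effective_value` helper, and the sample settings are all hypothetical.

```python
# Sources ordered from highest to lowest priority, following the chain above.
PRIORITY = (
    "sql_function",
    "alter_session",
    "environment",
    "parameter_file",
    "database_default",
)


def effective_value(parameter, settings):
    """Return (value, source) from the highest-priority source that sets parameter."""
    for source in PRIORITY:
        value = settings.get(source, {}).get(parameter)
        if value is not None:
            return value, source
    raise KeyError(parameter)


# Hypothetical settings: the session overrides the language, nothing overrides the sort.
settings = {
    "database_default": {"NLS_LANGUAGE": "AMERICAN", "NLS_SORT": "BINARY"},
    "environment": {"NLS_LANGUAGE": "AMERICAN"},
    "alter_session": {"NLS_LANGUAGE": "ITALIAN"},
}

print(effective_value("NLS_LANGUAGE", settings))
print(effective_value("NLS_SORT", settings))
```

In this model the ALTER SESSION value wins for NLS_LANGUAGE, while NLS_SORT falls through to the database default because no higher-priority source sets it.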
3.3 NLS_LANG format: NLS_LANG=<language>_<territory>.<client character set>
Language: specifies the language used for Oracle messages, sorting, and day and month names.
Territory: specifies the default date, number, currency, and other formats.
Client character set: specifies the character set that the client uses.
Note: the database-side settings can be viewed in NLS_DATABASE_PARAMETERS.
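To make the three components concrete, here is a minimal sketch that splits an NLS_LANG value into its parts. The `parse_nls_lang` helper and the sample value are hypothetical, and treating each component as optional is an assumption that goes beyond the format line above.

```python
import re

# <language>_<territory>.<client character set>; each part optional in this sketch.
NLS_LANG_PATTERN = re.compile(
    r"^(?P<language>[^_.]+)?(?:_(?P<territory>[^.]+))?(?:\.(?P<charset>.+))?$"
)


def parse_nls_lang(value):
    """Split an NLS_LANG string into language, territory, and client character set."""
    match = NLS_LANG_PATTERN.match(value.strip())
    if match is None:
        raise ValueError(f"not a valid NLS_LANG value: {value!r}")
    return match.groupdict()


print(parse_nls_lang("AMERICAN_AMERICA.AL32UTF8"))
print(parse_nls_lang(".AL32UTF8"))  # only the client character set is given
```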
Example of modifying NLS parameters:
    -- Take ALTER SESSION as an example:
    -- View the current session parameters
    SQL> SELECT * FROM nls_session_parameters;

    PARAMETER                 VALUE
    ------------------------- ----------------------------
    NLS_LANGUAGE              AMERICAN
    NLS_TERRITORY             AMERICA
    NLS_CURRENCY              $
    NLS_ISO_CURRENCY          AMERICA
    NLS_NUMERIC_CHARACTERS    .,
    NLS_CALENDAR              GREGORIAN
    NLS_DATE_FORMAT           DD-MON-RR
    NLS_DATE_LANGUAGE         AMERICAN
    NLS_SORT                  BINARY
    NLS_TIME_FORMAT           HH.MI.SSXFF AM
    NLS_TIMESTAMP_FORMAT      DD-MON-RR HH.MI.SSXFF AM
    NLS_TIME_TZ_FORMAT        HH.MI.SSXFF AM TZR
    NLS_TIMESTAMP_TZ_FORMAT   DD-MON-RR HH.MI.SSXFF AM TZR
    NLS_DUAL_CURRENCY         $
    NLS_COMP                  BINARY
    NLS_LENGTH_SEMANTICS      BYTE
    NLS_NCHAR_CONV_EXCP       FALSE

    17 rows selected.

    -- View the date format
    SQL> SELECT last_name, hire_date FROM hr.employees;

    LAST_NAME                 HIRE_DATE
    ------------------------- ---------
    King                      17-JUN-87
    Kochhar                   21-SEP-89
    De Haan                   13-JAN-93

    -- Change the NLS_LANGUAGE parameter
    SQL> ALTER SESSION SET nls_language = Italian;

    Session altered.

    -- View the date format after the parameter change
    SQL> SELECT last_name, hire_date FROM hr.employees;

    LAST_NAME                 HIRE_DATE
    ------------------------- ---------
    King                      17-GIU-87
    Kochhar                   21-SET-89
    De Haan

    -- The date display has changed: the month abbreviations are now Italian.
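The point of the session above is that the stored date is untouched and only its rendering depends on the session language. A small model of that idea follows. It is not Oracle's implementation: `format_dd_mon_rr` is a hypothetical helper, and the abbreviation table is deliberately limited to the months that appear in the output above.

```python
from datetime import date

# Month abbreviations per language. Only the entries visible in the session
# output are filled in; this is not Oracle's full locale data.
MONTH_ABBR = {
    "AMERICAN": {6: "JUN", 9: "SEP"},
    "ITALIAN": {6: "GIU", 9: "SET"},
}


def format_dd_mon_rr(value, language):
    """Render a date in a DD-MON-RR style using the given language's abbreviations."""
    month = MONTH_ABBR[language][value.month]
    return f"{value.day:02d}-{month}-{value.year % 100:02d}"


hire_date = date(1987, 6, 17)  # King's hire date from the example
for language in ("AMERICAN", "ITALIAN"):
    # Same stored value, different display per language.
    print(language, format_dd_mon_rr(hire_date, language))
```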