View and modify Oracle Character Set

Source: Internet
Author: User

1. What is the Oracle character set?

An Oracle character set is a collection of symbols used to interpret byte data. Character sets differ in size, and one character set can contain another (a subset-superset relationship). Oracle's national language support (NLS) architecture lets you store, process, and retrieve data in a local language. It makes database tools, error messages, sort order, dates, times, currency, numbers, and calendars adapt automatically to the local language and platform.

The most important parameter that affects the character set of Oracle databases is the NLS_LANG parameter.

The format is as follows: NLS_LANG = language_territory.charset

It has three components (language, territory, and character set), each of which controls a subset of NLS features.

Where:

Language: specifies the language of server messages, which determines, for example, whether prompts appear in Chinese or English.

Territory: specifies the date and number formats used by the server.

Charset: specifies the character set.

For example: AMERICAN_AMERICA.ZHS16GBK

From the composition of NLS_LANG, we can see that only the third component actually affects the database character set.

Therefore, as long as the third component (the character set) is the same for two databases, data can be exported from one and imported into the other. The first two components only affect things such as whether messages are displayed in Chinese or English.
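The decomposition of NLS_LANG described above can be sketched in a few lines of Python. This is only an illustration: the names `NlsLang`, `parse_nls_lang`, and `same_charset` are hypothetical helpers, not part of any Oracle tool, and the sketch assumes that language names contain no underscore.

```python
from typing import NamedTuple


class NlsLang(NamedTuple):
    language: str
    territory: str
    charset: str


def parse_nls_lang(value: str) -> NlsLang:
    """Split an NLS_LANG value of the form language_territory.charset."""
    # The character set is everything after the last dot.
    prefix, dot, charset = value.strip().rpartition(".")
    if not dot or not charset:
        raise ValueError(f"no character set component in {value!r}")
    # Assumption: language names may contain spaces but no underscore,
    # so the first underscore separates language from territory.
    language, underscore, territory = prefix.partition("_")
    if not underscore:
        raise ValueError(f"no territory component in {value!r}")
    return NlsLang(language, territory, charset)


def same_charset(nls_lang_a: str, nls_lang_b: str) -> bool:
    """Compare only the third component, ignoring language and territory."""
    charset_a = parse_nls_lang(nls_lang_a).charset.upper()
    charset_b = parse_nls_lang(nls_lang_b).charset.upper()
    return charset_a == charset_b
```

For instance, `same_charset("AMERICAN_AMERICA.ZHS16GBK", "SIMPLIFIED CHINESE_CHINA.ZHS16GBK")` is true even though the language and territory differ.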

2. Knowledge about character sets:

2.1 Character Set

In essence, a character set assigns a numeric code to each symbol in a specific set of symbols, according to a particular character encoding scheme. The earliest encoding scheme supported by Oracle Database is US7ASCII.

Oracle character set names follow this naming rule:

<Language><bit size><encoding>

For example, ZHS16GBK denotes a 16-bit (two-byte) Simplified Chinese (ZHS) character set in the GBK encoding.
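As a rough illustration of the naming rule, the following Python sketch splits a character set name into its three parts. The helper name `split_charset_name` and the regular expression are assumptions made for this example; names that do not follow the rule (such as UTF8) simply yield None.

```python
import re
from typing import Optional, Tuple

# Letters (language), then digits (bit size), then the encoding name.
_NAME_PATTERN = re.compile(r"^([A-Z]+)(\d+)([A-Z0-9]+)$")


def split_charset_name(name: str) -> Optional[Tuple[str, int, str]]:
    """Return (language, bit size, encoding), or None if the name does not fit."""
    match = _NAME_PATTERN.match(name.upper())
    if match is None:
        return None
    language, bits, encoding = match.groups()
    return language, int(bits), encoding


if __name__ == "__main__":
    for name in ("ZHS16GBK", "WE8ISO8859P1", "US7ASCII", "AL32UTF8", "UTF8"):
        print(name, split_charset_name(name))
```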

2.2 Character encoding schemes

2.2.1 Single-byte encoding

(1) Single-byte 7-bit character sets, which can define 128 characters. The most common one is US7ASCII.

(2) Single-byte 8-bit character sets, which can define 256 characters and are suitable for most European countries.

Example: WE8ISO8859P1 (Western Europe, 8-bit, ISO 8859 Part 1 encoding)

2.2.2 Multi-byte encoding

(1) Variable-length multi-byte encoding

Some characters are represented by one byte, and others by two or more bytes. Variable-length multi-byte encoding is commonly used for Asian languages, such as Japanese, Chinese, and Hindi.

Examples: AL32UTF8 (where AL stands for ALL, meaning that it applies to all languages) and ZHS16CGB231280

(2) Fixed-length multi-byte encoding

Each character is encoded with a fixed number of bytes. Currently, the only fixed-length multi-byte encoding supported by Oracle is AL16UTF16, which is used only as a national character set.
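The difference between single-byte and variable-length multi-byte encodings can be seen with Python's built-in codecs. This sketch assumes that the Python codecs `ascii`, `latin-1`, and `gbk` are reasonable stand-ins for US7ASCII, WE8ISO8859P1, and ZHS16GBK; it does not use Oracle's own conversion tables, and `encoded_length` is a hypothetical helper.

```python
from typing import Optional


def encoded_length(text: str, codec: str) -> Optional[int]:
    """Number of bytes needed for text in codec, or None if it cannot be encoded."""
    try:
        return len(text.encode(codec))
    except UnicodeEncodeError:
        return None


if __name__ == "__main__":
    samples = ["A", "\u00e9", "\u4e2d"]  # Latin letter, e with acute accent, Chinese character
    for codec in ("ascii", "latin-1", "gbk"):
        # A 7-bit set rejects everything beyond its 128 characters, an 8-bit
        # set rejects Chinese, and GBK mixes one-byte and two-byte characters.
        print(codec, [encoded_length(ch, codec) for ch in samples])
```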

2.2.3 Unicode encoding

Unicode is a single encoding scheme that covers all known characters currently in use around the world; that is, Unicode assigns a unique code to each character. UTF-16 is the 16-bit Unicode encoding. Oracle treats it as a fixed-length multi-byte encoding in which 2 bytes represent one Unicode character (supplementary characters take a 4-byte surrogate pair). AL16UTF16 is the UTF-16 character set.

UTF-8 is the 8-bit Unicode encoding. It is a variable-length multi-byte encoding that uses 1, 2, 3, or 4 bytes to represent a Unicode character. AL32UTF8, UTF8, and UTFE are Oracle's UTF-8 based character sets (UTFE is the variant for EBCDIC platforms).
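To make the byte counts concrete, here is a small Python sketch that measures the same characters in UTF-8 and UTF-16. It uses Python's own `utf-8` and `utf-16-be` codecs as an approximation; how Oracle's character sets store characters outside the Basic Multilingual Plane is not covered by this example, and `byte_lengths` is a hypothetical helper.

```python
from typing import Dict


def byte_lengths(text: str) -> Dict[str, int]:
    """Byte counts of text in UTF-8 and in UTF-16."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # The big-endian variant is used so that no byte order mark is added.
        "utf-16": len(text.encode("utf-16-be")),
    }


if __name__ == "__main__":
    for ch in ("A", "\u00e9", "\u4e2d", "\U0001F600"):
        print(f"U+{ord(ch):04X}", byte_lengths(ch))
```

Characters in the Basic Multilingual Plane always take 2 bytes in UTF-16, while their UTF-8 length varies from 1 to 3 bytes. A supplementary character such as U+1F600 takes 4 bytes in both.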

2.3 Character set supersets

When the encoded values of one character set (character set A) include all the encoded values of another character set (character set B), and the same encoded value represents the same character in both, character set A is a superset of character set B, and character set B is a subset of character set A.

The official Oracle8i and Oracle9i documentation provides a table of subset-superset pairs. For example, WE8ISO8859P1 is a subset of WE8MSWIN1252. Because US7ASCII is the earliest Oracle Database encoding, many character sets are supersets of it; for example, WE8ISO8859P1, ZHS16CGB231280, and ZHS16GBK are all supersets of US7ASCII.
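The superset definition can be checked empirically. The sketch below tests it with Python codecs over the Basic Multilingual Plane: every character that the candidate subset can encode must encode to the same bytes in the candidate superset. The function name `is_binary_superset` is hypothetical, and Python's codecs are only assumed to approximate Oracle's character set definitions, so Oracle's documented subset-superset table remains the authority.

```python
def is_binary_superset(subset_codec: str, superset_codec: str,
                       max_code_point: int = 0x10000) -> bool:
    """True if every character of subset_codec has identical bytes in superset_codec."""
    for code_point in range(max_code_point):
        if 0xD800 <= code_point <= 0xDFFF:
            continue  # surrogate code points are not characters
        ch = chr(code_point)
        try:
            subset_bytes = ch.encode(subset_codec)
        except UnicodeEncodeError:
            continue  # not part of the subset, so nothing to compare
        try:
            if ch.encode(superset_codec) != subset_bytes:
                return False  # same character, different encoded value
        except UnicodeEncodeError:
            return False  # character missing from the candidate superset
    return True


if __name__ == "__main__":
    for sub, sup in (("ascii", "latin-1"), ("ascii", "gbk"),
                     ("ascii", "utf-8"), ("latin-1", "utf-8")):
        print(f"{sup} is a superset of {sub}:", is_binary_superset(sub, sup))
```

The last pair shows why containing the same characters is not enough: UTF-8 can represent every Latin-1 character, but with different byte values, so it is not a superset in this sense.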

2.4 Database character set (Oracle server character set)

The database character set is specified when the database is created and, in principle, cannot be changed afterwards. When creating a database, you can specify both the character set and the national character set.

2.4.1 Character set

(1) Used to store CHAR, VARCHAR2, CLOB, LONG, and other such data types

(2) Used to identify table names, column names, and PL/SQL variables

(3) Used to store SQL and PL/SQL program units

2.4.2 National character set

(1) Used to store NCHAR, NVARCHAR2, NCLOB, and other such data types

(2) The national character set is essentially an additional character set chosen for the database, mainly to enhance Oracle's character processing capability: the NCHAR data types can use the fixed-length multi-byte encodings suited to Asian languages, whereas the database character set cannot. The national character set was redefined in Oracle9i and can only be one of the Unicode encodings AL16UTF16 and UTF8. The default is AL16UTF16.

2.4.3 Querying character set parameters

You can query the following data dictionary tables or views to see the character set settings:

NLS_DATABASE_PARAMETERS, PROPS$, V$NLS_PARAMETERS

In the query results, NLS_CHARACTERSET is the database character set and NLS_NCHAR_CHARACTERSET is the national character set.
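A minimal sketch of such a query from Python follows. It assumes a cursor obtained from some DB-API 2.0 driver connected to the database (connection setup is omitted and depends on your environment); `fetch_character_sets` and `CHARSET_QUERY` are hypothetical names. The query reads the PARAMETER and VALUE columns of NLS_DATABASE_PARAMETERS.

```python
from typing import Any, Dict

CHARSET_QUERY = (
    "SELECT parameter, value FROM nls_database_parameters "
    "WHERE parameter IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET')"
)


def fetch_character_sets(cursor: Any) -> Dict[str, str]:
    """Return the database and national character sets as a dictionary.

    cursor is assumed to be a DB-API 2.0 cursor on an open Oracle connection.
    """
    cursor.execute(CHARSET_QUERY)
    return {parameter: value for parameter, value in cursor.fetchall()}
```

The result maps NLS_CHARACTERSET and NLS_NCHAR_CHARACTERSET to the names configured for that database.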

2.4.4 Modifying the database character set

As mentioned above, the database character set in principle cannot be changed after the database is created. However, there are two feasible methods.

1. If you need to change the character set, the usual approach is to export the database data, recreate the database with the new character set, and then import the data so that it is converted.

2. You can use the ALTER DATABASE CHARACTER SET statement. However, this is restricted: the database character set can be changed only when the new character set is a superset of the current one. For example, UTF8 is a superset of US7ASCII, so a database that uses US7ASCII can be changed with ALTER DATABASE CHARACTER SET UTF8.
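The choice between the two methods can be sketched as a simple lookup. The table below contains only the subset-superset pairs named in this article and is not a complete list; `KNOWN_SUPERSETS` and `plan_charset_change` are hypothetical names, and the function only returns a description of the method, it does not touch a database.

```python
# Subset -> direct supersets, limited to the pairs mentioned in this article.
KNOWN_SUPERSETS = {
    "US7ASCII": {"UTF8", "WE8ISO8859P1", "ZHS16CGB231280", "ZHS16GBK"},
    "WE8ISO8859P1": {"WE8MSWIN1252"},
}


def plan_charset_change(current: str, new: str) -> str:
    """Suggest which of the two methods applies to a character set change."""
    current, new = current.upper(), new.upper()
    if current == new:
        return "no change needed"
    if new in KNOWN_SUPERSETS.get(current, set()):
        # Method 2: allowed because the new set is a superset of the current one.
        return f"ALTER DATABASE CHARACTER SET {new}"
    # Method 1: fall back to a full export, recreate, and import cycle.
    return "export the data, recreate the database, import the data"


if __name__ == "__main__":
    print(plan_charset_change("US7ASCII", "UTF8"))
    print(plan_charset_change("ZHS16GBK", "US7ASCII"))
```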

2.5 Client character set (NLS_LANG parameter)

2.5.1 Meaning of the client character set

The client character set defines how character data is encoded on the client: any character data sent from or to the client is encoded in the character set defined for the client. A client is any application that connects directly to the database, such as sqlplus or exp/imp. The client character set is set through the NLS_LANG parameter.

2.5.2 NLS_LANG parameter format

NLS_LANG = language_territory.client character set

Language: specifies the language of Oracle messages, the sort order, and the day and month names

Territory: specifies the default date, number, currency, and other formats

Client character set: specifies the character set that the client uses

Example: NLS_LANG = AMERICAN_AMERICA.US7ASCII

AMERICAN is the language, AMERICA is the territory, and US7ASCII is the client character set.

2.5.3 Setting the client character set

1) UNIX

$ NLS_LANG="simplified chinese"_china.zhs16gbk

$ export NLS_LANG

To make the setting permanent, add these lines to the profile file of the oracle user.

2) Windows

Edit the registry:

regedit.exe --- HKEY_LOCAL_MACHINE --- SOFTWARE --- ORACLE --- HOME
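When a client program is launched from a script, NLS_LANG can also be set for that one process only. The following Python sketch illustrates the idea under stated assumptions: `run_with_nls_lang` is a hypothetical helper, and the child process here is just a Python one-liner that echoes the variable, standing in for a real client.

```python
import os
import subprocess
import sys
from typing import List


def run_with_nls_lang(command: List[str], nls_lang: str) -> subprocess.CompletedProcess:
    """Run command with NLS_LANG set in the child's environment only."""
    env = dict(os.environ)      # copy, so the parent environment is untouched
    env["NLS_LANG"] = nls_lang
    return subprocess.run(command, env=env, capture_output=True,
                          text=True, check=True)


if __name__ == "__main__":
    # Stand-in for a real client program: print the variable the child sees.
    child = [sys.executable, "-c", "import os; print(os.environ['NLS_LANG'])"]
    result = run_with_nls_lang(child, "SIMPLIFIED CHINESE_CHINA.ZHS16GBK")
    print(result.stdout.strip())
```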

2.5.4 Querying NLS parameters

Oracle provides a number of NLS parameters, such as NLS_LANGUAGE, NLS_DATE_FORMAT, and NLS_CALENDAR, for adapting the database and client machines to local formats. You can query them through the following data dictionary views and V$ view.

NLS_DATABASE_PARAMETERS: shows the current NLS parameter values of the database, including the database character set

NLS_SESSION_PARAMETERS: shows the parameters set by NLS_LANG and any values changed with ALTER SESSION (it does not include the client character set set by NLS_LANG)

NLS_INSTANCE_PARAMETERS: shows the parameters defined in the parameter file init<SID>.ora

V$NLS_PARAMETERS: shows the current values of the NLS parameters

2.5.5 Modifying NLS parameters

You can modify the NLS parameters in the following ways:

(1) Modify the initialization parameter file used at instance startup

(2) Modify the environment variable NLS_LANG

(3) Use the ALTER SESSION statement

(4) Use certain SQL functions

Priority of NLS settings, from highest to lowest: SQL function > ALTER SESSION > environment variable or registry > parameter file > database default parameters
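The priority order (SQL function, then ALTER SESSION, then environment variable or registry, then parameter file, then database defaults) behaves like a chain of lookups in which the first level that defines a parameter wins. A minimal Python model of that rule follows; `effective_nls` is a hypothetical helper and the parameter values in the usage example are purely illustrative.

```python
from collections import ChainMap
from typing import Mapping, Optional


def effective_nls(sql_function: Optional[Mapping[str, str]] = None,
                  session: Optional[Mapping[str, str]] = None,
                  client_env: Optional[Mapping[str, str]] = None,
                  init_file: Optional[Mapping[str, str]] = None,
                  database_default: Optional[Mapping[str, str]] = None) -> ChainMap:
    """Combine the five levels; earlier arguments take priority over later ones."""
    levels = [sql_function, session, client_env, init_file, database_default]
    # ChainMap searches its mappings in order and returns the first hit.
    return ChainMap(*[dict(level or {}) for level in levels])


if __name__ == "__main__":
    params = effective_nls(
        session={"NLS_DATE_FORMAT": "YYYY-MM-DD"},                 # ALTER SESSION
        client_env={"NLS_LANGUAGE": "SIMPLIFIED CHINESE"},          # from NLS_LANG
        database_default={"NLS_DATE_FORMAT": "DD-MON-RR",
                          "NLS_LANGUAGE": "AMERICAN",
                          "NLS_CALENDAR": "GREGORIAN"},
    )
    for name in ("NLS_DATE_FORMAT", "NLS_LANGUAGE", "NLS_CALENDAR"):
        print(name, "=", params[name])
```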
