Oracle Character Set Analysis

Source: Internet
Author: User

There are many character sets. ASCII was among the earliest, but it supports only a very limited number of characters, so many other encoding schemes were introduced later, including Unicode. Unicode is a single character set intended to cover all known characters in the world. UTF-16 is a Unicode encoding built on 16-bit units and is variable length: most characters take 2 bytes, and supplementary characters take 4. UTF-8 is a Unicode encoding built on 8-bit units and is also variable length, using 1 to 4 bytes per character.

GBK and UTF-8 are two of the most common character encodings for Chinese text. GBK is a Chinese national standard for Simplified Chinese character sets. It extends the national standard GB2312 and remains compatible with it, and it is used mainly for encoding Chinese characters. UTF-8 is a general-purpose multi-byte encoding designed for international use. It covers the characters needed by all countries, can be displayed by any browser that supports UTF-8, and is used worldwide. ZHS16GBK stores one Chinese character in 2 bytes. UTF8 is a variable-length multi-byte encoding in which one Chinese character usually takes 3 bytes.
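The storage difference between the two encodings can be checked with a short Python sketch. It assumes that Python's built-in gbk and utf-8 codecs behave like Oracle's ZHS16GBK and UTF8 for the sample characters; the helper name encoded_length is made up for this illustration.

```python
# Python's "gbk" and "utf-8" codecs stand in for Oracle's ZHS16GBK and UTF8
# here (an assumption for illustration, not an Oracle API).
def encoded_length(ch: str, codec: str) -> int:
    """Return how many bytes `ch` occupies in the given codec."""
    return len(ch.encode(codec))


for ch in ("A", "中"):
    print(repr(ch),
          "gbk bytes:", encoded_length(ch, "gbk"),
          "utf-8 bytes:", encoded_length(ch, "utf-8"))
```

A column sized for GBK data may therefore need more bytes once the same text is stored in UTF-8.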

When character set A contains every character of character set B, and each shared character has the same encoding value in both, A is a superset of B and B is a subset of A. Under this definition UTF8 is not a strict superset of ZHS16GBK: it can represent every ZHS16GBK character, but it encodes Chinese characters with different bytes. US7ASCII, by contrast, is a strict subset of both. Oracle allows conversion from a subset to a superset, but not from a superset to a subset, since characters missing from the subset cannot be preserved.
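The "same encoding value" condition can be tested directly by comparing encoded bytes. The sketch below again uses Python's ascii, gbk, and utf-8 codecs as stand-ins for the Oracle character sets, and same_bytes is a hypothetical helper name.

```python
# Compare the raw bytes a piece of text produces under two codecs.
# Python codecs are used as stand-ins for the Oracle character sets.
def same_bytes(text: str, codec_a: str, codec_b: str) -> bool:
    """True if `text` encodes to identical bytes in both codecs."""
    return text.encode(codec_a) == text.encode(codec_b)


# ASCII text keeps its byte values in both GBK and UTF-8.
print(same_bytes("Oracle", "ascii", "gbk"))
print(same_bytes("Oracle", "ascii", "utf-8"))
# Chinese text exists in both encodings but with different byte values.
print(same_bytes("中文", "gbk", "utf-8"))
```

Identical bytes mean no conversion is needed; shared characters with different bytes mean the data has to be converted when it moves between the two character sets.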

Character set names generally follow the rule <language><bit size><encoding>. For example, in AL16UTF16, AL stands for All Languages, 16 is the bit size, and UTF16 is the encoding. In the Chinese character set ZHS16GBK, ZHS stands for Simplified Chinese, 16 means a character takes up to 16 bits, and GBK is the standard character set name.
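As a rough illustration of the naming rule, the following Python sketch splits a name into its three parts. The regular expression and the names CharsetName and parse_charset_name are assumptions made for this example; names that do not follow the rule (UTF8, for instance) simply return None.

```python
import re
from typing import NamedTuple, Optional


class CharsetName(NamedTuple):
    language: str   # e.g. "ZHS", "AL", "US"
    bits: int       # e.g. 16
    encoding: str   # e.g. "GBK"


# Letters, then digits, then an encoding name that starts with a letter.
_NAME_PATTERN = re.compile(r"^([A-Z]+)(\d+)([A-Z].*)$")


def parse_charset_name(name: str) -> Optional[CharsetName]:
    """Split a name such as 'ZHS16GBK'; return None if it does not fit the rule."""
    match = _NAME_PATTERN.match(name.upper())
    if match is None:
        return None
    language, bits, encoding = match.groups()
    return CharsetName(language, int(bits), encoding)


for name in ("ZHS16GBK", "AL32UTF8", "US7ASCII", "ZHS16CGB231280", "UTF8"):
    print(name, "->", parse_charset_name(name))
```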

The Unicode character set most commonly used in Oracle is AL32UTF8, which can accommodate multiple languages. Databases that store only English data generally use US7ASCII.

The following describes the character sets involved in Oracle: the database character set, the client character set, and the client application character set.

The character set of an Oracle database is selected when the database is created and cannot be changed afterwards.

On Simplified Chinese platforms, the default character set is ZHS16GBK. Another common Chinese character set is ZHS16CGB231280 (incompatible).

Files related to the database character set are stored separately under $ORACLE_HOME/nls/data. These files define the language (NLS_LANGUAGE), territory (NLS_TERRITORY), and character set (NLS_CHARACTERSET).

Pay attention to the client character set when using the EXP/IMP export and import tools. The client character set is configured through NLS_LANG, whose format is language_territory.characterset. The language part determines the language of Oracle messages and the display of dates on the client; territory determines currency and number formats; characterset determines the character set used by the client application (such as sqlplus), which is the character set used to decode the data sent by the database.
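The three parts of an NLS_LANG value can be pulled apart as in the Python sketch below. It handles only the full language_territory.characterset form and rejects anything else; NlsLang and parse_nls_lang are names invented for this example.

```python
from typing import NamedTuple


class NlsLang(NamedTuple):
    language: str
    territory: str
    characterset: str


def parse_nls_lang(value: str) -> NlsLang:
    """Split a full 'language_territory.characterset' value into its parts."""
    head, dot, characterset = value.strip().rpartition(".")
    language, underscore, territory = head.partition("_")
    if not (dot and underscore and language and territory and characterset):
        raise ValueError(f"not in language_territory.characterset form: {value!r}")
    return NlsLang(language, territory, characterset)


print(parse_nls_lang("AMERICAN_AMERICA.ZHS16GBK"))
print(parse_nls_lang("SIMPLIFIED CHINESE_CHINA.ZHS16GBK"))
```

On a Unix client the input would typically come from the environment, for example os.environ.get("NLS_LANG").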

The client character set defines the encoding method of client character data. Any character data sent from or to the client is encoded using the character set defined by the client. The client can be seen as a variety of applications that can be directly connected to the database, for example, sqlplus, exp/imp. The client character set is set by setting the NLS_LANG parameter.

NLS_LANG is usually set to the same character set as the database character set. That way the data is not converted on the export side and is backed up exactly as stored. For the import, set NLS_LANG on the import side to the character set of the export file, so that any conversion happens only once, when the data is loaded into the target database. For that conversion to succeed, the character set of the target database must be a superset of the character set of the exported database.
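Why the target must be a superset can be seen in a small Python sketch. It mimics a conversion with Python codecs, replacing any character the target cannot represent with a question mark; Oracle's own conversion is not reproduced here, and convert is a hypothetical helper.

```python
# Simulate moving text from a source encoding into a target encoding.
# Characters the target cannot represent are replaced with "?", which is
# how Python's errors="replace" behaves (not a reproduction of Oracle).
def convert(text: str, source_codec: str, target_codec: str) -> str:
    """Decode bytes from the source codec and re-encode them in the target codec."""
    raw = text.encode(source_codec)
    decoded = raw.decode(source_codec)
    return decoded.encode(target_codec, errors="replace").decode(target_codec)


print(convert("Oracle 中文", "gbk", "utf-8"))  # target covers every character
print(convert("Oracle 中文", "gbk", "ascii"))  # target lacks Chinese characters
```

In the second call the Chinese characters are lost for good, because the replacement marks carry no information about the original data.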

To identify the character set of an export file, use a tool such as UltraEdit, or run cat xx.dmp | od -x | head -2, and look at the 2nd and 3rd bytes of the file header. Common values are 0354 for ZHS16GBK, 0352 for ZHS16CGB231280, 0001 for US7ASCII, and 0367 for UTF8.
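The same header check can be sketched in Python. The mapping contains only the four values listed above, and the code assumes that the 2nd and 3rd bytes are read in file order as one big-endian number; dump_charset is a name made up for this example.

```python
# Character set IDs taken from the values listed in the text.
KNOWN_IDS = {
    0x0354: "ZHS16GBK",
    0x0352: "ZHS16CGB231280",
    0x0001: "US7ASCII",
    0x0367: "UTF8",
}


def dump_charset(header: bytes) -> str:
    """Map the 2nd and 3rd header bytes of an export file to a character set name."""
    if len(header) < 3:
        raise ValueError("need at least the first 3 bytes of the file")
    # Assumption: the two bytes form a big-endian number in file order.
    charset_id = int.from_bytes(header[1:3], "big")
    return KNOWN_IDS.get(charset_id, f"unknown (0x{charset_id:04x})")


# With a real file: open("xx.dmp", "rb").read(3) supplies the header bytes.
print(dump_charset(b"\x03\x03\x54"))
```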

The database character set can be queried from NLS_DATABASE_PARAMETERS or with select userenv('language') from dual. The client character set is found in the NLS_LANG registry entry on Windows and in the NLS_LANG environment variable on Unix. The client application character set, which determines how query output is displayed on the terminal, can be viewed by right-clicking the cmd window and opening its properties. It is generally the Simplified Chinese GBK character set.

Oracle documentation on character sets commonly refers to four parameter views, of which the first two matter most:

NLS_DATABASE_PARAMETERS comes from props$ and shows the character set of the database. It is set when the database is created and is generally not changed.

V$NLS_PARAMETERS shows the settings of the current session, which are affected by the client (through alter session, an environment variable, the registry, or a parameter file).

NLS_INSTANCE_PARAMETERS comes from v$parameter and shows the NLS parameters set for the instance, typically in the parameter file.

NLS_SESSION_PARAMETERS shows one of two things: the parameters set through NLS_LANG, or the values in effect after alter session changes them. When the session has no settings of its own, the values match NLS_INSTANCE_PARAMETERS.
