Comprehensive Understanding of Oracle Database character set (1)

Last Update:2013-12-15 Source: Internet

Author: User


What is the Oracle character set?

An Oracle character set is a collection of symbols used to interpret byte data. Character sets differ in size, and some of them contain others (subset and superset relationships).

Oracle's National Language Support (NLS) architecture lets you store, process, and retrieve data in a localized language. It makes database tools, error messages, sort order, dates, times, currency, numbers, and calendars adapt automatically to the local language and platform.

The most important parameter affecting the character set of an Oracle database is NLS_LANG. Its format is as follows:

NLS_LANG = language_territory.charset

It has three components: language, territory, and character set. Each component controls a subset of the NLS features:

language specifies the language of server messages, territory specifies the server's date and number formats, and charset specifies the character set. For example: AMERICAN_AMERICA.ZHS16GBK.

From the composition of NLS_LANG, we can see that only the third part actually affects the database character set. Therefore, if two databases have the same third part, data can be imported and exported between them. The first two parts only determine things such as whether messages are shown in Chinese or English.
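The language_territory.charset format described above can be split mechanically. The following is a minimal Python sketch, not an Oracle API: the names NlsLang and parse_nls_lang are hypothetical, and the sketch assumes that all three components are present in the value.

```python
from typing import NamedTuple


class NlsLang(NamedTuple):
    """The three components of an NLS_LANG value (hypothetical helper type)."""
    language: str
    territory: str
    charset: str


def parse_nls_lang(value: str) -> NlsLang:
    """Split 'language_territory.charset' into its three components."""
    # The character set is everything after the last dot.
    head, dot, charset = value.strip().rpartition(".")
    if not dot or not head or not charset:
        raise ValueError(f"expected language_territory.charset, got {value!r}")
    # Language and territory are separated by an underscore; names may
    # contain spaces (for example "SIMPLIFIED CHINESE").
    language, underscore, territory = head.partition("_")
    if not underscore or not language or not territory:
        raise ValueError(f"expected language_territory.charset, got {value!r}")
    return NlsLang(language, territory, charset)


parts = parse_nls_lang("AMERICAN_AMERICA.ZHS16GBK")
print(parts.charset)  # prints ZHS16GBK
```

Only the charset field matters when deciding whether two databases can exchange data, which is why it is kept as a separate field here.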

How to query Oracle character sets

Many people have encountered data import failures caused by mismatched character sets. Three character sets are involved: the character set of the Oracle server, the character set of the Oracle client, and the character set of the dmp file. For data to be imported correctly, all three must be consistent.

1. Query the character set of the Oracle server

There are many ways to find the character set of the Oracle server. The most direct query is as follows:

SQL> select userenv('language') from dual;

The result looks like this: AMERICAN_AMERICA.ZHS16GBK

2. Query the character set of the dmp file

A dmp file exported with Oracle's exp tool also contains character set information: the 2nd and 3rd bytes of the file record its character set. If the dmp file is small, for example a few MB or a few dozen MB, you can open it in UltraEdit (in hexadecimal mode), read the 2nd and 3rd bytes, for example 0354, and then use the following SQL statement to find the corresponding character set:

SQL> select nls_charset_name(to_number('0354','xxxx')) from dual;  
ZHS16GBK

If the dmp file is large, for example 2 GB or more (which is also the most common case), opening it in a text editor is very slow or fails altogether. In that case you can use the following command on a Unix host:

cat exp.dmp |od -x|head -1|awk '{print $2 $3}'|cut -c 3-6

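Then use the preceding SQL statement to obtain the character set name. Note that od -x prints two-byte words in the host's byte order, so this pipeline returns the 2nd and 3rd bytes only on big-endian hosts; on a little-endian host, od -An -tx1 -j1 -N2 exp.dmp prints those two bytes directly as two hex pairs.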
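The same header bytes can also be read without a hex editor or od. The sketch below is one possible approach in Python, assuming only what the text states (the 2nd and 3rd bytes hold the character set ID); the function names and the file name exp.dmp are illustrative. It reads just three bytes, so file size does not matter, and it does not depend on the host's byte order.

```python
def charset_id_from_header(header: bytes) -> str:
    """Return the 2nd and 3rd bytes of a dmp header as four hex digits."""
    if len(header) < 3:
        raise ValueError("need at least the first three bytes of the file")
    # header[1:3] selects the 2nd and 3rd bytes (offsets 1 and 2).
    return header[1:3].hex().upper()


def dmp_charset_id_hex(path: str) -> str:
    """Read only the first three bytes of the file at 'path'."""
    with open(path, "rb") as f:
        return charset_id_from_header(f.read(3))


# Example with a header whose 2nd and 3rd bytes are 0x03 and 0x54:
print(charset_id_from_header(b"\x03\x03\x54"))  # prints 0354
```

The resulting string, for example 0354, can be passed to the nls_charset_name query shown earlier; calling dmp_charset_id_hex("exp.dmp") would do the same for a real export file.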

3. Query the character set of the Oracle client

This is relatively simple. On Windows, it is the NLS_LANG value for the corresponding OracleHome in the registry. It can also be set in a DOS window, for example:

set nls_lang=AMERICAN_AMERICA.ZHS16GBK

Set this way, it affects only the environment of that window. On Unix platforms, it is the NLS_LANG environment variable:

$ echo $NLS_LANG
AMERICAN_AMERICA.ZHS16GBK

If the check shows that the server and client character sets are inconsistent, change the client's character set to match the server's.
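The server and client check can be scripted. The following Python sketch assumes the server value was obtained separately (for example from the userenv('language') query) and passed in as a string; client_charset and charsets_match are hypothetical helper names, not part of any Oracle client library.

```python
import os
from typing import Mapping, Optional


def client_charset(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the charset part of NLS_LANG, or None if it is not set."""
    value = env.get("NLS_LANG", "")
    _, dot, charset = value.rpartition(".")
    return charset.upper() if dot and charset else None


def charsets_match(server_language: str,
                   env: Mapping[str, str] = os.environ) -> bool:
    """Compare the server's charset with the client's NLS_LANG charset.

    'server_language' may be a full language_territory.charset string
    or just the charset name.
    """
    _, _, server_charset = server_language.rpartition(".")
    return client_charset(env) == server_charset.upper()


# Language and territory may differ; only the third part is compared.
env = {"NLS_LANG": "SIMPLIFIED CHINESE_CHINA.ZHS16GBK"}
print(charsets_match("AMERICAN_AMERICA.ZHS16GBK", env))  # prints True
```

Passing the environment as a parameter keeps the check easy to try with different settings; by default it reads the real NLS_LANG of the current process.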

Modify the character set of Oracle

As mentioned above, Oracle character sets can have subset and superset relationships.

For example, US7ASCII is a subset of ZHS16GBK, so converting from US7ASCII to ZHS16GBK causes no data interpretation problems and no data loss. UTF8 is the broadest of these character sets because it is based on Unicode. It is a variable-width encoding in which a Chinese character generally takes three bytes instead of the two it takes in ZHS16GBK, so it occupies more space.
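The subset relationship and the space difference can be illustrated outside the database. The sketch below uses Python's built-in ascii, gbk, and utf-8 codecs as stand-ins for US7ASCII, ZHS16GBK, and UTF8; that correspondence is an assumption made for illustration, and the code does not reproduce Oracle's own conversion logic.

```python
def is_lossless(text: str, codec: str) -> bool:
    """True if 'text' survives an encode/decode round trip in 'codec'."""
    try:
        return text.encode(codec).decode(codec) == text
    except UnicodeEncodeError:
        return False


# Subset to superset: ASCII bytes mean the same thing when read as GBK.
ascii_bytes = "Oracle".encode("ascii")
same_under_gbk = ascii_bytes.decode("gbk") == "Oracle"

# Superset to subset: a Chinese character cannot be represented in ASCII.
chinese_fits_ascii = is_lossless("中", "ascii")

# Storage cost of the same character in the two encodings.
gbk_len = len("中".encode("gbk"))
utf8_len = len("中".encode("utf-8"))

print(same_under_gbk, chinese_fits_ascii)  # prints True False
print(gbk_len, utf8_len)                   # prints 2 3
```

The round-trip test is the practical meaning of "subset": every character of the smaller set must survive conversion into the larger set unchanged.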

In theory, once a database has been created, its character set cannot be changed. It is therefore important to decide which character set to use at the design and installation stage. According to Oracle's official documentation, character set conversion is supported from a subset to a superset, but not the other way round. If two character sets have no subset/superset relationship, Oracle does not support converting between them.

On a database server, an incorrect character set change can have many unpredictable consequences and may seriously affect the normal operation of the database, so before making the change, confirm that the two character sets have a subset/superset relationship. In general, we do not recommend changing the character set of an Oracle database server unless you have to. In particular, the two most commonly used character sets, ZHS16GBK and ZHS16CGB231280, do not have a subset/superset relationship, so in theory conversion between them in either direction is not supported.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
