During testing, the local Oracle installation used the UTF-8 character set, while the project requires the GBK character set. To avoid problems when importing and exporting data between different character sets in the future, the following notes were compiled.
Modifying the Oracle character set: the newly installed Oracle database uses the AL32UTF8 character set, but a project export package uses ZHS16GBK. To convert between them, refer to the following:
I. What is the Oracle character set?
An Oracle character set is a collection of symbols used to interpret byte data. Character sets differ in size, and one can contain another (a superset relationship). Oracle's National Language Support (NLS) architecture lets you store, process, and retrieve data in a localized language. It makes database tools, error messages, sort order, dates, times, currency, numbers, and calendars adapt automatically to the local language and platform.
The most important parameter that affects the character set of Oracle databases is the NLS_LANG parameter. The format is as follows:
NLS_LANG = language_territory.charset
It has three components (language, territory, and character set), each of which controls a subset of the NLS features:
language specifies the language of server messages, territory specifies the server's date and number formats, and charset specifies the character set. For example: AMERICAN_AMERICA.ZHS16GBK
From the composition of NLS_LANG, we can see that only the third part really affects the database character set. Therefore, if the third part is the same for two databases, data can be imported and exported between them. The first two parts only determine whether messages are shown in Chinese or English.
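To make the three-part format concrete, here is a minimal shell sketch that splits an NLS_LANG value into its components. The function name nls_split is hypothetical (it is not part of any Oracle tool), and the sketch assumes a well-formed value of the form language_territory.charset.

```shell
#!/bin/sh
# Hypothetical helper: split an NLS_LANG value into its three components.
# Assumes the value is well formed: language_territory.charset
nls_split() {
    value=$1
    charset=${value##*.}          # text after the last '.'
    lang_terr=${value%.*}         # text before the last '.'
    language=${lang_terr%%_*}     # text before the first '_'
    territory=${lang_terr#*_}     # text after the first '_'
    printf 'language=%s\nterritory=%s\ncharset=%s\n' \
        "$language" "$territory" "$charset"
}

nls_split "AMERICAN_AMERICA.ZHS16GBK"
nls_split "SIMPLIFIED CHINESE_CHINA.AL32UTF8"
```

Only the charset line of the output matters when deciding whether two databases can exchange data directly.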
II. How to query Oracle character sets
Many people have run into data import failures caused by mismatched character sets. Three character sets are involved: the character set of the Oracle server, the character set of the Oracle client, and the character set of the dmp file. All three must be consistent for the data to be imported correctly.
1. Querying the character set of the Oracle server
There are many ways to find out the character set of the Oracle server. The most direct query is:
SQL> select userenv('language') from dual;
The result is similar to the following: AMERICAN_AMERICA.ZHS16GBK (the local result is SIMPLIFIED CHINESE_CHINA.AL32UTF8)
2. Querying the character set of a dmp file
A dmp file exported with Oracle's exp tool also records its character set, in the 2nd and 3rd bytes of the file. If the dmp file is small, for example a few MB or a few dozen MB, you can open it in UltraEdit in hexadecimal mode and read the 2nd and 3rd bytes, for example 0354. Then use the following SQL statement to find the corresponding character set:
SQL> select nls_charset_name(to_number('0354','xxxx')) from dual;
ZHS16GBK
If the dmp file is large, for example 2 GB or more (which is the most common case), opening it in a text editor is very slow or fails outright. In that case, use the following command on a Unix host:
cat exp.dmp | od -x | head -1 | awk '{print $2 $3}' | cut -c 3-6
Then, you can use the preceding SQL statement to obtain its character set.
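Because od -x prints 16-bit words in host byte order, the pipeline above depends on the endianness of the host. As a hedged alternative, the sketch below reads the two bytes one at a time, so byte order does not matter. The function names dmp_charset_hex and hex_to_id are hypothetical, and the sketch assumes a head that supports the -c option (as the GNU and BSD versions do).

```shell
#!/bin/sh
# Hypothetical helper: print bytes 2 and 3 of a file as four hex digits.
dmp_charset_hex() {
    # head keeps the first 3 bytes, tail keeps the last 2 of those,
    # and od -tx1 prints them byte by byte, independent of host byte order.
    head -c 3 "$1" | tail -c 2 | od -An -tx1 | tr -d ' \n'
}

# Hypothetical helper: convert the hex digits to a decimal character set ID.
hex_to_id() {
    printf '%d\n' "0x$1"
}

# Run only when a dmp file path is passed as the first argument.
if [ -n "${1:-}" ]; then
    hex=$(dmp_charset_hex "$1")
    echo "hex=$hex id=$(hex_to_id "$hex")"
fi
```

The hex digits can be passed to the to_number call shown earlier. The decimal form is the same number in base 10 (0354 in hexadecimal is 852), which nls_charset_name also accepts directly.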
3. Querying the character set of the Oracle client
This is relatively simple. On Windows, it is the NLS_LANG value under HKEY_LOCAL_MACHINE\SOFTWARE\Oracle\HOME0 in the registry. You can also set it in a DOS window, for example:
set nls_lang=AMERICAN_AMERICA.ZHS16GBK
In this way, only the environment variables in this window are affected.
On Unix platforms, it is the NLS_LANG environment variable:
$ echo $NLS_LANG
AMERICAN_AMERICA.ZHS16GBK
If the check shows that the server and client character sets are inconsistent, change the client character set to match the server's.
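The comparison can be scripted. The following is a minimal sketch, assuming the server value has already been obtained with the userenv('language') query and is passed in as text. The function name check_client_charset is hypothetical, and the client value defaults to the NLS_LANG environment variable.

```shell
#!/bin/sh
# Hypothetical helper: compare the charset part of the server's language
# string with the charset part of the client's NLS_LANG.
#   $1  server value, e.g. the output of: select userenv('language') from dual;
#   $2  client value (optional, defaults to $NLS_LANG)
check_client_charset() {
    server_value=$1
    client_value=${2:-${NLS_LANG:-}}
    if [ -z "$client_value" ]; then
        echo "NLS_LANG is not set"
        return 2
    fi
    server_cs=${server_value##*.}   # text after the last '.'
    client_cs=${client_value##*.}
    if [ "$server_cs" = "$client_cs" ]; then
        echo "match: $client_cs"
    else
        echo "mismatch: server=$server_cs client=$client_cs"
        return 1
    fi
}

check_client_charset "SIMPLIFIED CHINESE_CHINA.AL32UTF8" "AMERICAN_AMERICA.AL32UTF8"
```

On a mismatch the function returns a non-zero status, so a wrapper script could stop before running an import.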