Garbled Chinese text after importing data into Oracle with IMP (reposted)


(Reposted from http://blog.chinaunix.net/uid-186064-id-2823338.html)

Garbled Chinese text when importing data into Oracle with imp

After importing data into Oracle with the IMP command, all of the queried text fields are garbled.

    1. As a rule, do not modify the server-side character set: doing so can cause garbled text when third-party tools log in to the database (the method for modifying the server-side character set is described in detail below).
    2. Change the character set of the DMP file to match that of the Oracle database server, and the imported data displays correctly. My system is RHEL 5.4, 32-bit.
I. What is the Oracle character set

An Oracle character set is a collection of symbols used to interpret byte data. Character sets differ in size and can contain one another. Oracle's National Language Support (NLS) architecture lets you store, process, and retrieve data in local languages. It makes database tools, error messages, sort order, date, time, currency, number, and calendar conventions adapt automatically to the local language and platform.

The most important parameter affecting the Oracle database character set is NLS_LANG. It has the following format:

NLS_LANG = language_territory.charset

It has three components (language, territory, and character set), each of which controls a subset of NLS features:

language specifies the language of server messages, territory specifies the server's date and number formats, and charset specifies the character set. For example: AMERICAN_AMERICA.UTF8

From the composition of NLS_LANG we can see that the part that really affects the database character set is the third one. Two databases can therefore import and export data to each other as long as the third part is the same; the first part only determines whether messages are shown in Chinese or English.
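As a minimal sketch of this point, the shell functions below pull the third component out of an NLS_LANG-style value and compare it between two settings. The names nls_charset and same_charset are hypothetical helpers introduced here, not part of any Oracle tool, and the comparison is a plain case-insensitive string match that knows nothing about subset or superset relationships.

```shell
# Hypothetical helper: print the character set part of an NLS_LANG-style
# value, i.e. everything after the last dot.
nls_charset() {
    case $1 in
        *.*) printf '%s\n' "${1##*.}" ;;
        *)   return 1 ;;   # no dot, so there is no character set component
    esac
}

# Hypothetical helper: succeed when two NLS_LANG-style values name the same
# character set, ignoring case and the language/territory parts.
same_charset() {
    a=$(nls_charset "$1") || return 1
    b=$(nls_charset "$2") || return 1
    a=$(printf '%s' "$a" | tr '[:lower:]' '[:upper:]')
    b=$(printf '%s' "$b" | tr '[:lower:]' '[:upper:]')
    [ "$a" = "$b" ]
}
```

For example, same_charset 'AMERICAN_AMERICA.UTF8' 'SIMPLIFIED CHINESE_CHINA.UTF8' succeeds, because only the language and territory differ.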

II. How to query the Oracle character set

Many people have seen a data import fail because of differing character sets. Three character sets are involved:

    1. The character set of the Oracle server.
    2. The character set of the Oracle client.
    3. The character set of the DMP file.

When importing data, these three character sets must be consistent; otherwise the imported data is garbled.

Querying the Oracle server-side character set

There are many ways to find the Oracle server-side character set; the most direct is the following query:

SQL> select userenv('language') from dual;

USERENV('LANGUAGE')

----------------------------------------------------

AMERICAN_AMERICA.UTF8

SQL>

The result looks like this: AMERICAN_AMERICA.UTF8

Querying the character set of a DMP file

A DMP file exported with the Oracle exp tool also carries character set information: its 2nd and 3rd bytes record the character set of the file. If the DMP file is small, say a few MB or a few dozen MB, you can open it in UltraEdit in hex mode, read the 2nd and 3rd bytes (for example 0354), and then look up the corresponding character set with the following SQL:

SQL> select nls_charset_name(to_number('0354','xxxx')) from dual;

ZHS16GBK

If the DMP file is large, say more than 2 GB (which is also the most common case), a text editor opens it very slowly or not at all. In that case you can use the following command (on a UNIX host):

$ cat www.yeserver.com.dmp | od -x | head -1 | awk '{print $2 $3}' | cut -c 3-6

0345

$

The corresponding character set can then be obtained using the SQL above.
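One caveat, offered as an observation rather than something the original article states: od -x prints two-byte words in the host's byte order, so the od -x pipeline can pick up the wrong bytes on little-endian machines such as x86 Linux. A byte-order-independent sketch is to dump exactly the 2nd and 3rd bytes one at a time with the standard od options -j (skip) and -N (count). The function name dmp_charset_hex is a hypothetical helper introduced here.

```shell
# Hypothetical helper: print the 2nd and 3rd bytes of a file as four hex
# digits, independent of the host byte order.
dmp_charset_hex() {
    # -A n : no offset column      -t x1 : one byte per value
    # -j 1 : skip the first byte   -N 2  : read exactly two bytes
    od -A n -t x1 -j 1 -N 2 "$1" | tr -d ' \n'
    echo
}
```

Calling dmp_charset_hex www.yeserver.com.dmp prints a four-digit value that can be passed to the nls_charset_name query in the same way as the value read in UltraEdit.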

Querying the Oracle client-side character set

On Linux/UNIX, it is the NLS_LANG environment variable.

$ echo $NLS_LANG

AMERICAN_AMERICA.UTF8

On Windows, it is the NLS_LANG value under HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\HOME0 in the registry. It can also be set in a DOS window, for example: set NLS_LANG=AMERICAN_AMERICA.UTF8

This affects only the environment variables of that window.

Make sure that the server-side and client-side character sets are consistent.

III. Modifying the Oracle character set

Oracle character sets can contain one another. For example, US7ASCII is a subset of ZHS16GBK, so converting from US7ASCII to ZHS16GBK causes no data-interpretation problems and no data loss. UTF8 should be the largest of all the character sets because it is Unicode-based; it stores characters in a variable number of bytes (Chinese characters usually take three), so it needs more storage space.

Once a database has been created, its character set in principle cannot be changed. According to Oracle's official documentation, a character set can only be converted from a subset to a superset, not the other way round. If two character sets have no subset/superset relationship, Oracle does not support converting between them. Be sure to verify that such a relationship exists before modifying the character set. In general, we do not recommend modifying the character set of the Oracle database server unless it is a last resort.

In particular, there is no subset/superset relationship between ZHS16CGB231280 and ZHS16GBK, the two character sets we use most often, so converting between them is in theory not supported.

For reference material on character sets, see Oracle's official documentation:

http://download.oracle.com/docs/cd/B19306_01/server.102/b14225/applocaledata.htm

Modifying the server-side character set (not recommended)

Before Oracle 8, you could change a database's character set by directly updating the data dictionary table props$. From Oracle 8 onward, at least three system tables record the database character set, so changing props$ alone is incomplete and can have serious consequences. The correct procedure is as follows:

$ sqlplus "/ as sysdba"

First run SHUTDOWN IMMEDIATE to shut down the database server.

Then run the following commands:

SQL> STARTUP MOUNT;

SQL> ALTER SYSTEM ENABLE RESTRICTED SESSION;

SQL> ALTER SYSTEM SET JOB_QUEUE_PROCESSES=0;

SQL> ALTER SYSTEM SET AQ_TM_PROCESSES=0;

SQL> ALTER DATABASE OPEN;

SQL> ALTER DATABASE CHARACTER SET INTERNAL_USE UTF8;

(INTERNAL_USE skips the subset/superset check.)

SQL> ALTER DATABASE NATIONAL CHARACTER SET INTERNAL UTF8;

This line did not work: it failed with ORA-00933 (SQL command not properly ended). The previous command had already taken effect by then, and other articles do not mention this line.

SQL> SHUTDOWN IMMEDIATE;

SQL> STARTUP;

Modifying the DMP file character set

The 2nd and 3rd bytes of the DMP file record its character set, so editing those two bytes directly can get the file past Oracle's check. In theory this is only valid when going from a subset to a superset, but in many cases it also works without such a relationship, and the character sets we commonly use, such as US7ASCII, WE8ISO8859P1, ZHS16CGB231280, and ZHS16GBK, can generally be changed this way. Because only the DMP file is changed, the impact is limited.

There are several ways to do this; the simplest is to edit the 2nd and 3rd bytes of the DMP file directly in UltraEdit. For example, to change the DMP file's character set to UTF8, look up the hexadecimal code of that character set with the following SQL:

SQL> select to_char(nls_charset_id('UTF8'), 'XXXX') from dual;

TO_CH

-----

367

SQL>

Then change bytes 2 and 3 of the DMP file to 0367.

If the DMP file is large and cannot be opened with UltraEdit, you need to use a different method.
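One possible approach for large files, sketched here under stated assumptions rather than taken from the original article, is to overwrite the two bytes in place with dd, which never loads the file into an editor. The function name set_dmp_charset_hex is hypothetical. The sketch assumes a POSIX shell with dd, printf, and cut, and a four-digit hex code obtained from the nls_charset_id query and padded with a leading zero (for example 0367). The file is modified in place, so work on a backup copy.

```shell
# Hypothetical helper: overwrite the 2nd and 3rd bytes of FILE with the two
# bytes given as four hex digits (for example 0367).
set_dmp_charset_hex() {
    file=$1
    hex=$2
    case $hex in
        [0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]) ;;
        *) echo "expected four hex digits, got: $hex" >&2; return 1 ;;
    esac
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }

    hi=$(printf '%s' "$hex" | cut -c 1-2)   # first byte, as two hex digits
    lo=$(printf '%s' "$hex" | cut -c 3-4)   # second byte, as two hex digits

    # Build the two bytes from octal escapes, then write them at offset 1
    # (the 2nd byte). conv=notrunc leaves the rest of the file untouched.
    printf "\\$(printf '%03o' "0x$hi")\\$(printf '%03o' "0x$lo")" |
        dd of="$file" bs=1 seek=1 count=2 conv=notrunc 2>/dev/null
}
```

A call such as set_dmp_charset_hex copy_of_export.dmp 0367 (the file name is a placeholder) changes only those two bytes; whether imp then reads the data correctly still depends on the character sets involved, as discussed above.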
