Oracle Character Set conversion during data migration

Source: Internet
Author: User
Tags oracle documentation
According to the official oracle documentation, the database character set cannot be changed after it is created. Therefore, it is important to consider which character set to use at the beginning of design and installation. For Databases

According to the official oracle documentation, the database character set cannot be changed after it is created. Therefore, it is important to consider which character set to use at the beginning of design and installation. For Databases

According to the official Oracle documentation, the database character set cannot be changed after it is created. Therefore, it is important to consider which character set to use at the beginning of design and installation. For the database server, modifying the character set incorrectly may cause many unmeasurable consequences, which may seriously affect the normal operation of the database. however, since each software company uses the same character set, Character Set conversion is certainly encountered during data migration. for example, the most common problem is that the English character set is converted to Chinese, and the Chinese character is converted to English. it also contains hundreds of GB of dmp files during data migration.
During data migration, I used two methods: modifying the dmp file Character Set and modifying the database character set.
Modify the dmp file character set. The 2nd 3rd bytes of the dmp file records the character set information. Therefore, you can directly modify the 2nd 3rd bytes of the dmp file to 'Cheat 'the oracle check. In theory, this can be modified only from the subset to the superset, but in many cases it can be modified without the subset and superset relationships. Some of our commonly used character sets, such as US7ASCII, WE8ISO8859P1, ZHS16CGB231280, and ZHS16GBK can be modified. Because only the dmp file is changed, it has little impact.
There are many specific modification methods. The simplest is to directly use UltraEdit to modify the 2nd and 3rd bytes of the dmp file.
For example, if you want to change the dmp file's character set to ZHS16GBK, you can use the following SQL statement to find the hexadecimal code corresponding to this character set: SQL> select to_char (nls_charset_id ('zhs16gbk '), 'xxxx') from dual;
0354 then, change the 2 and 3 bytes of the dmp file to 0354. However, this method can only be modified using the dmp exported by clients of oracle8 or earlier versions.
One is to modify the database character set, because the official document indicates that the database cannot be modified after it is created, but it is not feasible to create an intermediate database to use the character set of the source database for creation during data migration.
Import the data from the source database to the intermediate database, modify the character set of the intermediate database so that it can match the character set of the target database, and then import the data to the target database from the intermediate database. this is very troublesome for me.
The target database has no business data, so the character set of the target database is directly modified to import the data of the source database.
--- 1 --- modify the character set in the database
--- Chinese Character Set to English character set
-- English
Update sys. props $ set value $ = 'us7ascii 'where;

Commit;

Update sys. props $ set value $ = 'us7ascii 'where;
Commit;

--- 2 --- modify the character set in the forced synchronization Database
Shutdown immediate;

Startup mount;

Alter system enable restricted session;

Alter system set JOB_QUEUE_PROCESSES = 0;

Alter system set AQ_TM_PROCESSES = 0;

Alter database open;

--- Assign a value to the CHARSET variable
Col value NEW_VALUE CHARSET
Select value from NLS_DATABASE_PARAMETERS where parameter = 'nls _ CHARACTERSET ';

Col value NEW_VALUE NCHARSET
Select value from NLS_DATABASE_PARAMETERS where parameter = 'nls _ NCHAR_CHARACTERSET ';

-- INTERNAL_USE is a parameter that is not written in the document and is used to force character set consistency.
Alter database character set INTERNAL_USE & CHARSET;
-- Alter database national character set INTERNAL_USE & NCHARSET;

-- Start the database again
Shutdown immediate;
STARTUP;

Set nls_lang = AMERICAN_AMERICA.US7ASCII;

--- English character set to Chinese Character Set

Update sys. props $ set value $ = 'zhs16gbk' where;

Commit;

Update sys. props $ set value $ = 'zhs16gbk' where;

Commit;

--- 2 --- modify the character set in the forced synchronization Database
Shutdown immediate;

Startup mount;

Alter system enable restricted session;

Alter system set JOB_QUEUE_PROCESSES = 0;

Alter system set AQ_TM_PROCESSES = 0;

Alter database open;

--- Assign a value to the CHARSET variable
Col value NEW_VALUE CHARSET
Select value from NLS_DATABASE_PARAMETERS where parameter = 'nls _ CHARACTERSET ';

Col value NEW_VALUE NCHARSET
Select value from NLS_DATABASE_PARAMETERS where parameter = 'nls _ NCHAR_CHARACTERSET ';

-- INTERNAL_USE is a parameter that is not written in the document and is used to force character set consistency.
Alter database character set INTERNAL_USE & CHARSET;
-- Alter database national character set INTERNAL_USE & NCHARSET;

-- Start the database again
Shutdown immediate;
STARTUP;


Set nls_lang = SIMPLIFIED CHINESE_CHINA.ZHS16GBK;
1. Pay attention to the database version when modifying dmp.
9i takes the following database and needs to modify the 1st and 2nd bytes of the dmp file in line 3rd
For databases larger than 10 Gb, You need to modify the 3rd and 3rd bytes of the dmp file in 4th rows.


This method is quite limited. For example, if your database is 10 TB, this method obviously won't work ....

2. The mandatory modification is the database character set.
Since December 10, csscan is recommended for oracle.

3. Add: This method is obviously not good for the overall migration of a database with a large amount of data. It is only used for a small number of databases or some tables with a small number of multibyte (such as Chinese)
Dblink + utl_raw

This is another method provided by Lunar_Zhang. Thanks for the method.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.