A preliminary discussion on character set problem (Iv.)

Source: Internet
Author: User
Tags character set reserved client sqlplus
Problem
Link

Http://www.eygle.com/special/NLS_CHARACTER_SET_04.htm

4. Import Export and transformation



Import and export is a commonly used data migration and transformation tool, because its export file is platform-independent, so it is most commonly used in cross platform migration.
In the export operation, it is very important that the client's character set settings, that is, the client's Nls_lang settings.
The Nls_lang parameter consists of the following sections:







Nls_lang=<language>_<territory>.<clients characterset>





Nls_lang the meanings of each part are as follows: language specifies the language used by the:-O racle message-The month and day of the date display territory specify-currency and number formats-regions and the habit of calculating weeks and dates characterset:- Controlling the character set used by client applications is usually set either to the client (such as Windows) code page or to UTF8 the code page for the current system on Windows can use the CHCP command:

E:\>chcp
Active code page: 936


Code page 936, which is the Chinese character set GBK, at Microsoft's official site, we can get specific coding rules about the 936 code page, please refer to the following links:


Http://www.microsoft.com/globaldev/reference/dbcs/936.htm

Let's look at a simple test to see how these parameters work:




E:\>set nls_lang=simplified Chinese_china. Zhs16gbke:\>sqlplus "/As SYSDBA" Sql*plus:release 9.2.0.4.0-production on Saturday November 1 22:51:59 2003Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved. Connect to: Oracle9i Enterprise Edition release 9.2.0.4.0-productionwith The partitioning, Oracle Label SECU Rity, OLAP and Oracle Data Mining optionsjserver release 9.2.0.4.0-productionsql> select Sysdate from dual; Sysdate----------January-November-03 has selected 1 rows. Sql> exit from Oracle9i Enterprise Edition release 9.2.0.4.0-productionwith The partitioning, Oracle Label security, OLAP and Oracle Data Mining optionsjserver release 9.2.0.4.0-production break open E:\>set Nls_lang=american_america. Zhs16gbke:\>sqlplus "/As SYSDBA" Sql*plus:release 9.2.0.4.0-production on Sat Nov 1 22:52:24 2003Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved. Connected to:oracle9i Enterprise Edition release 9.2.0.4.0-productionwith The partitioning, Oracle Label security, OLAP and Oracle Data Mining optionsjserver Release 9.2.0.4.0-productionsql> select Sysdate from dual; Sysdate---------01-nov-031 Row selected. Sql>


To view client-side Nls_lang settings, you can use the following methods:

Windows use: Echo%nls_lang% such as: E:\>echo%nls_lang%american_america. Zhs16gbkunix use: Env|grep Nls_lang such as:/opt/oracle>env|grep Nls_langnls_lang=american_china. Zhs16gbkwindows client settings, you can change Nls_lang in the registry, where the specific key value is: HKEY_LOCAL_MACHINE OFTWARE\ORACLE\HOMEXX\XX refers to the system number when multiple oracle_home are present.

Import and export are client products, like Sql*plus and Oralce forms, so using the Exp/imp tool converts character sets in the way defined by Nls_lang.

The character set used by the export is recorded in the export file, and when the file is imported, the character set that is used when exporting is checked, if the character set is different from the Nls_lang of the imported client
Settings, the character set is converted according to the import client Nls_lang settings, and if necessary, further conversion occurs before the data is inserted into the database.

It is usually best to set the client character set to be the same as the database side when exporting, which avoids unnecessary data transformations when exporting, and the export file will have the same character set as the database.
Even if you import the exported file into a database in a different character set in the future, you can defer the conversion to the import time.

When you import data, there are two main scenarios:
1. The source and target databases have the same character set settings
At this point, you only need to set Nls_lang equal to the database character set to import (provided that the export uses the same character set as the source database, that is, the same three)

2. Source database and Target database character set are different
If we use the same nls_lang as the source database when we export, you can set the client Nls_lang equal to the character set used when exporting, which
The sample transformation only occurs on the database side and occurs only once.

For example:
If you are converting from we8mswin1252 to UTF8
1) using Nls_lang=american_america. we8mswin1252 Export the database.
The export file you created contains the we8mswin1252 data
2 use Nls_lang=american_america when importing. we8mswin1252
The conversion occurs only in the database where the insert data is UTF8.

The above assumptions can only be converted if the target database character set is a superset of the source database character set. If different, usually need to do some special processing.

Let's take a quick look at the import conversion process (oracle8i, for example):

1. Determine the export database character set environment
You can get the character set settings for the exported file by reading the export file header
2. Determine the character set to import session, that is, to import the Nls_lang environment variables used by the session
3.IMP Read Export File
Read the export file character set ID and compare the Nls_lang of the import process
4. If the export file character set and the import session character set are the same, you do not need to convert in this step
If it is different, you need to convert the data to the character set used for the import session.
However, this conversion can only occur between the single byte character set.
Let's look at a test:




E:\nls2>set Nls_lang=american_america. Us7ascii Settings Import session Nls_lang to Us7asciie:\nls2>e:\oracle\ora8i\bin\imp Eygle/eygle file= Sus7ascii-cus7ascii-exp817.dmp fromuser=eygle touser=eygle tables=test This export file is exported from the US7ASCII database, exporting the client Nls_ Lang is also us7asciiimport:release 8.1.7.1.1-production on Fri Nov 7 00:59:22 2003 (c) to Copyright Oracle Corporation. All rights reserved. Connected to:oracle8i Enterprise Edition release 8.1.7.1.1-productionwith the partitioning optionjserver release 8.1.7. 1.1-production is imported at this time, no character set conversion is required between DMP file and Nls_lang. Export file created by export:v08.01.07 via conventional pathimport do in Us7ascii character set and ZHS16GBK Char Acter Setimport Server uses ZHS16GBK character set (possible charset conversion) export server uses UTF8 NCHAR character SE T (possible ncharset conversion). . Importing table "TEST" 2 rows Importedimport terminated successfully without warnings.

5. For imports of multiple byte character sets (e.g., UTF8)
Need to set import session character set and export character set same
Otherwise, you will encounter: IMP-16 "Required Character set conversion (type%lu to%lu) not supported" error.
:

E:\nls2>set Nls_lang=american_america. ZHS16GBK Import session Character set to ZHS16GBK import us7ascii export file E:\nls2>e:\oracle\ora8i\bin\imp eygle/eygle file= Sus7ascii-cus7ascii-exp817.dmp fromuser=eygle touser=eygleimport:release 8.1.7.1.1-production on Fri Nov 7 00:38:55 200 3 (c) Copyright to Oracle Corporation. All rights reserved. Connected to:oracle8i Enterprise Edition release 8.1.7.1.1-productionwith the partitioning optionjserver release 8.1.7. 1.1-productionimp-00016:required Character Set conversion (Type 1 to 852) not Supportedimp-00000:import terminated uns Uccessfully does not support a single byte character set to multiple byte conversions from the export file Us7ascii to the import Nls_lang set to ZHS16GBK, reporting the above error.

6. The import session character set should be the super of the exported character set, otherwise, the special characters will be difficult to convert correctly.
7. When the data is converted to the import session character set, if the import session character set differs from the imported database character set, then the final conversion is required, which requires that the database characters be imported
Sets are super for importing the session character set, otherwise some proprietary characters will not be converted correctly.
We continue to look at the above two processes, here are two principles:
1. If the Nls_lang settings are the same as the database, the data (of course, 2 in transit) is inserted directly into the database without conversion.
2. If the Nls_lang settings are different from the database, then the data needs to be converted before it can be inserted into the database.
Let's look at the first example above:
:

Export file created by export:v08.01.07 via conventional pathimport do in Us7ascii character set and ZHS16GBK Char Acter Setimport Server uses ZHS16GBK character set (possible charset conversion) export server uses UTF8 NCHAR character SE T (possible ncharset conversion). . Importing table "TEST" 2 rows Importedimport terminated successfully without Warnings. This time after the first step of the conversion of the data, Us7ascii to ZHS16GBK lost first, as is inserted into the database, we see that the database is stored in the wrong characters (in the back
Section we made a detailed conversion): E:\nls2>sqlplus eygle/eyglesql*plus:release 9.2.0.4.0-production on Fri Nov 7 00:35:39 2003Copyright (c ) 1982, 2002, Oracle Corporation. All rights reserved. Connected to:oracle8i Enterprise Edition release 8.1.7.1.1-productionwith the partitioning optionjserver release 8.1.7.1 .1-productionsql> select * from test; NAME--------------------2bJTtest

In Oracle9i, the above situation is slightly different.




Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.