A preliminary discussion on character set problem (v)

Source: Internet
Author: User
Tags array character set command line log sql reserved oracle database sqlplus
Problem
Original link:


Http://www.eygle.com/special/NLS_CHARACTER_SET_05.htm


The original published in the Itpub Technology series "Oracle Database DBA topic Technology Pristine", without permission, is prohibited reproduced this article.

We know that in the export file, the character set ID that is used in the export is recorded, and by looking at the 2nd, 3 bytes of the exported file header, we can find the character set ID of the 16 binary representation, on Windows,
We can use tools such as UltraEdit to open the DMP file to view its exported character set::



On Unix We can view the following commands:

Cat Expdat.dmp | Od-x | Head



Oracle provides a standard function for converting character set names and IDs:



Sql> Select nls_charset_id (' ZHS16GBK ') from dual; nls_charset_id (' ZHS16GBK ')--------------------------8521 row selected. Sql> Select Nls_charset_name (852) from dual; Nls_char--------zhs16gbk1 row selected. Decimal conversion hexadecimal:sql> Select To_char (' 852 ', ' xxxx ') from dual; To_ch-----3541 Row selected.

corresponding to the 2nd and 3 bytes in the above figure, we know that the export file character set is ZHS16GBK.


The following script can be used for valid character sets in the query database:




Col nls_charset_id for 9999col nls_charset_name for A30col hex_id for A20select nls_charset_id (value) nls_charset_id, Valu E Nls_charset_name,to_char (nls_charset_id (value), ' xxxx ') hex_idfrom v$nls_valid_valueswhere parameter = ' CHARACTERSET ' ORDER by nls_charset_id (value)/

The output sample is as follows:



nls_charset_id Nls_charset_name hex_id
-------------- ------------------------------ -------------
1 Us7ascii 1
2 We8dec 2
3 WE8HP 3
4 us8pc437 4
5 we8ebcdic37 5
6 we8ebcdic500 6
7 we8ebcdic1140 7
8 we8ebcdic285 8
...................
850 zhs16cgb231280 352
851 zhs16maccgb231280 353
852 ZHS16GBK 354
853 Zhs16dbcs 355
860 Zht32euc 35c
861 Zht32sops 35d
862 ZHT16DBT 35e
863 Zht32tris 35f
864 Zht16dbcs 360
865 Zht16big5 361
866 ZHT16CCDC 362
867 zht16mswin950 363
868 Zht16hkscs 364
870 AL24UTFFSS 366
871 UTF8 367
872 UTFE 368

..................................




In many cases, when we import operations, have left the source database, then if the target database of the character set and export files inconsistent, often require special treatment,
The following several methods are introduced, mainly in Us7ascii and ZHS16GBK as examples

1. The source database character set is US7ASCII, the export file character set is Us7ascii or ZHS16GBK, and the target database character set is ZHS16GBK
In Oracle92, we found that the export file could not be imported into the Oracle9i database correctly, regardless of how it was handled, possibly because the oracle9i encoding scheme has changed significantly.

Here's a simple test we've done, where the export file naming rules are:


S-server, followed by the Server character set
C-client, followed by the client character set when the export operation


The client character set setting is done on the command line at import time, and we omit part of the test process.
For ORACLE9IR2, our test result is the US7ASCII character set, which, in any case, cannot be imported correctly into the database of the ZHS16GBK character set.

When an import operation is made, if the characters are not converted correctly, the Oracle database automatically uses a "?" Instead, that is, code 63.




 e:\nls2>set Nls_lang=american_america. Us7asciie:\nls2>imp eygle/eygle file=sus7ascii-cus7ascii.dmp fromuser=eygle touser=eygle tables=testImport: Release 9.2.0.4.0-production on Mon Nov 3 17:14:39 2003Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved. Connected to:oracle9i Enterprise Edition release 9.2.0.4.0-productionwith The partitioning, Oracle Label security, OLAP and Oracle Data Mining optionsjserver release 9.2.0.4.0-productionexport file created by export:v09.02.00 via Conventio NAL Pathimport do in Us7ascii character set and al16utf16 NCHAR character Setimport server uses ZHS16GBK character set ( Possible charset conversion). . Importing table "TEST" 2 rows Importedimport terminated successfully without warnings. E:\nls2>sqlplus eygle/eyglesql*plus:release 9.2.0.4.0-production on Mon Nov 3 17:14:50 2003Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved. Connected to:oracle9i Enterprise Edition release 9.2.0.4.0-productionwITH the partitioning, Oracle Label Security, OLAP and Oracle Data Mining optionsjserver release 9.2.0.4.0-productionsql& Gt Select Name,dump (name) from test; Namedump (NAME)-----------------------------???? Typ=1 len=4:63,63,63,63testtyp=1 len=4:116,101,115,1162 rows selected. Sql> exitdisconnected from oracle9i Enterprise Edition release 9.2.0.4.0-productionwith The partitioning, Oracle Labe L Security, OLAP and Oracle Data Mining optionsjserver release 9.2.0.4.0-productione:\nls2>set Nls_lang=american_amer Ica. Zhs16gbke:\nls2>imp eygle/eygle file=sus7ascii-cus7ascii.dmp fromuser=eygle touser=eygle tables=test Yimport:release 9.2.0.4.0-production on Mon Nov 3 17:15:28 2003Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved. Connected to:oracle9i Enterprise Edition release 9.2.0.4.0-productionwith The partitioning, Oracle Label security, OLAP and Oracle Data Mining optionsjserver release 9.2.0.4.0-productionexport file created by export:v09.02.00Via conventional pathimport done in ZHS16GBK character set and al16utf16 NCHAR character Setexport client uses Us7ascii CH Aracter Set (possible charset conversion). . Importing table "TEST" 2 rows Importedimport terminated successfully without warnings. E:\nls2>sqlplus eygle/eyglesql*plus:release 9.2.0.4.0-production on Mon Nov 3 17:15:34 2003Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved. Connected to:oracle9i Enterprise Edition release 9.2.0.4.0-productionwith The partitioning, Oracle Label security, OLAP and Oracle Data Mining optionsjserver release 9.2.0.4.0-productionsql> select Name,dump (name) from test; Namedump (NAME)--------------------------------------------------------------------------------???? Typ=1 len=4:63,63,63,63testtyp=1 len=4:116,101,115,116???? Typ=1 len=4:63,63,63,63testtyp=1 len=4:116,101,115,1164 rows selected. sql> drop table test; Table dropped. Sql> exitdisconnected from oracle9i Enterprise Edition release 9.2.0.4.0-productIonwith the partitioning, Oracle Label Security, OLAP and Oracle Data Mining optionsjserver release 9.2.0.4.0-production E:\nls2>set Nls_lang=american_america. Zhs16gbke:\nls2>imp eygle/eygle file=sus7ascii-czhs16gbk.dmp fromuser=eygle touser=eygle tables=test Yimport:release 9.2.0.4.0-production on Mon Nov 3 17:17:21 2003Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved. Connected to:oracle9i Enterprise Edition release 9.2.0.4.0-productionwith The partitioning, Oracle Label security, OLAP and Oracle Data Mining optionsjserver release 9.2.0.4.0-productionexport file created by export:v09.02.00 via Conventio NAL Pathimport do in ZHS16GBK character set and Al16utf16 NCHAR set. . Importing table "TEST" 2 rows Importedimport terminated successfully without warnings. E:\nls2>sqlplus eygle/eyglesql*plus:release 9.2.0.4.0-production on Mon Nov 3 17:17:30 2003Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved. Connected To:oraclE9i Enterprise Edition Release 9.2.0.4.0-productionwith the partitioning, Oracle Label Security, OLAP and Oracle Data Mi Ning optionsjserver Release 9.2.0.4.0-productionsql> select Name,dump (name) from test; Namedump (NAME)----------------------------------------------???? Typ=1 len=4:63,63,63,63testtyp=1 len=4:116,101,115,1162 rows selected. Sql> exitdisconnected from oracle9i Enterprise Edition release 9.2.0.4.0-productionwith The partitioning, Oracle Labe L Security, OLAP and Oracle Data Mining optionsjserver release 9.2.0.4.0-productione:\nls2>set Nls_lang=american_amer Ica. Us7asciie:\nls2>imp eygle/eygle file=sus7ascii-czhs16gbk.dmp fromuser=eygle touser=eygle tables=test Yimport:release 9.2.0.4.0-production on Mon Nov 3 17:18:00 2003Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved. Connected to:oracle9i Enterprise Edition release 9.2.0.4.0-productionwith The partitioning, Oracle Label security, OLAP and Oracle Data Mining optionsjserver Release 9.2.0.4.0-productionexport file created by export:v09.02.00 via conventional pathimport do in Us7ascii ter set and al16utf16 NCHAR character Setimport server uses ZHS16GBK character set (possible charset conversion) export CLI ENT uses ZHS16GBK character set (Possible charset). . Importing table "TEST" 2 rows Importedimport terminated successfully without warnings. E:\nls2>sqlplus eygle/eyglesql*plus:release 9.2.0.4.0-production on Mon Nov 3 17:18:08 2003Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved. Connected to:oracle9i Enterprise Edition release 9.2.0.4.0-productionwith The partitioning, Oracle Label security, OLAP and Oracle Data Mining optionsjserver release 9.2.0.4.0-productionsql> select Name,dump (name) from test; Namedump (NAME)----------------------------------------???? Typ=1 len=4:63,63,63,63testtyp=1 len=4:116,101,115,116???? Typ=1 len=4:63,63,63,63testtyp=1 len=4:116,101,115,1164 rows selected. Sql>

For this situation, we can set the export character set to Us7ascii by using the Oracle8i Export tool, and then modify the second to third character after the export, and modify 0001 to
0354, so that the data from the US7ASCII character set can be imported correctly into the ZHS16GBK database.

To modify an export file:




To import a modified export file:



 e:\nls2>set Nls_lang=american_america. Zhs16gbke:\nls2>imp eygle/eygle file=sus7ascii-cus7ascii-exp817.dmp fromuser=eygle touser=eygle tables= Testimport:release 9.2.0.4.0-production on Mon Nov 3 17:37:17 2003Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved. Connected to:oracle9i Enterprise Edition release 9.2.0.4.0-productionwith The partitioning, Oracle Label security, OLAP and Oracle Data Mining optionsjserver release 9.2.0.4.0-productionexport file created by export:v08.01.07 via Conventio NAL Pathimport do in ZHS16GBK character set and al16utf16 NCHAR character Setexport server uses UTF8 NCHAR character set (Possible ncharset conversion). . Importing table "TEST" 2 rows Importedimport terminated successfully without warnings. E:\nls2>sqlplus eygle/eyglesql*plus:release 9.2.0.4.0-production on Mon Nov 3 17:37:23 2003Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved. Connected to:oracle9i Enterprise Edition release 9.2.0.4.0-pRoductionwith the partitioning, Oracle Label Security, OLAP and Oracle Data Mining optionsjserver release 9.2.0.4.0-prod Uctionsql> Select Name,dump (name) from Test;name dump (name)----------------------------------------------------- ---------------------------Test typ=1 len=4:178,226,202,212test typ=1 len=4:116,101,115,1162 rows selected. Sql>

2. How to use CREATE database
If the character set used by the export file is Us7ascii and the character set of the target database is ZHS16GBK, we can modify it using the CREATE database method as follows:




 SQL> col parameter for a30sql> col value for a30sql> select * from V$nls_parameters; PARAMETER VALUE------------------------------------------------------------nls_language americannls_territory Americanls_currency $NLS _iso_currency americanls_numeric_characters., Nls_calendar Gregoriannls_date_format Dd-mon-rrnls_date_language americannls_characterset zhs16gbknls_sort binary...................19 rows selected. Sql> CREATE DATABASE character set Us7ascii;create database character set us7ascii*error at line 1:ora-01031:insuffici ENT privilegessql> SELECT * from V$nls_parameters; PARAMETER VALUE------------------------------------------------------------nls_language americannls_territory Americanls_currency $NLS _iso_currency americanls_numeric_characters., Nls_calendar Gregoriannls_date_format Dd-mon-rrnls_date_language americannls_characterset us7asciinls_sort binary..............19 rows selected. Sql> exitdisconnected from oracle9i Enterprise Edition release 9.2.0.4.0-productionwitH The partitioning, Oracle Label Security, OLAP and Oracle Data Mining optionsjserver release 9.2.0.4.0-productione:\nls 2>set Nls_lang=american_america. Us7asciie:\nls2>imp eygle/eygle file=sus7ascii-cus7ascii.dmp fromuser=eygle touser=eygleimport:release 9.2.0.4.0 -Production on Sun Nov 2 14:53:26 2003Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved. Connected to:oracle9i Enterprise Edition release 9.2.0.4.0-productionwith The partitioning, Oracle Label security, OLAP and Oracle Data Mining optionsjserver release 9.2.0.4.0-productionexport file created by export:v09.02.00 via Conventio NAL Pathimport do in Us7ascii character set and al16utf16 NCHAR character Setimport server uses ZHS16GBK character set ( Possible charset conversion). . Importing table "TEST" 2 rows Importedimport terminated successfully without warnings. E:\nls2>sqlplus eygle/eyglesql*plus:release 9.2.0.4.0-production on Sun Nov 2 14:53:35 2003Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved. Connected to:oracle9i Enterprise Edition release 9.2.0.4.0-productionwith The partitioning, Oracle Label security, OLAP and Oracle Data Mining optionsjserver release 9.2.0.4.0-productionsql> select * from test; NAME----------Test test2 rows selected.

We see that when the CREATE database character Set US7ASCII command is issued, the set of character sets in the database v$nls_parameters changes, which affects the import process.
The data can be imported correctly after the change, and the setting is restored after the database has been reset.

Tip: V$nls_paraemters from X$nls_parameters, the dynamic performance view affects the import operation, and nls_database_parameters from the props$ data table, affecting the data store.

3. Oracle-provided character scanning Tool Csscan

We say that the above method should only be used in the last resort, its essence is to deceive the database, forcing the import of data, possibly loss of metadata.
If you want to ensure the integrity of your data, you should use the Csscan Scan database, find all the incompatible characters, and then, by writing the appropriate scripts and code, update after the conversion to ensure that the data is correct.
Let's take a quick look at the use of Csscan.

Before you use Csscan, you need to create a corresponding data dictionary object as a sys user:




E:\nls2>sqlplus "/As SYSDBA" Sql*plus:release 9.2.0.4.0-production on Sun Nov 2 19:42:07 2003Copyright (c) 1982, 200 2, Oracle Corporation. All rights reserved. Connected to:oracle9i Enterprise Edition release 9.2.0.4.0-productionwith The partitioning, Oracle Label security, OLAP and Oracle Data Mining optionsjserver release 9.2.0.4.0-productionsql> select instance_name from V$intance;select ins Tance_name from V$intance *error in line 1:ora-00942:table or view does not existsql> select instance_name from V$inst Ance;instance_name----------------penny1 Row selected. Sql> @?/rdbms/admin/csminst.sqluser created. Grant succeeded .......

This script creates the corresponding user (CSMIG) and the data Dictionary object, and the scanned information is recorded in the corresponding data dictionary.

We can call this tool on the command line to scan the database:

 e:\nls2>csscan full=y fromchar=zhs16gbk tochar=us7ascii log=us7check.log CAPTURE=Y ARRAY=1000000 PROCESS= 2Character Set Scanner v1.1:release 9.2.0.1.0-production on Sun Nov 2 20:24:45 2003Copyright (c) 1982, 2002, Oracle Co Rporation. All rights reserved. username:eygle/eygleconnected to:oracle9i Enterprise Edition release 9.2.0.4.0-productionwith the partitioning, Oracle Label security, OLAP and Oracle Data Mining optionsjserver release 9.2.0.4.0-productionenumerating tables to scan ... p Rocess 1 scanning SYS. SOURCE$[AAAABHAABAAAAIRAAA]. Process 2 scanning SYS. ATTRIBUTE$[AAAAEOAABAAAAHZAAA]. Process 2 scanning SYS. PARAMETER$[AAAAEOAABAAAAHZAAA]. Process 2 scanning SYS. METHOD$[AAAAEOAABAAAAHZAAA] ... process 2 scanning SYSTEM. DEF$_AQERROR[AAAA8FAABAAACWJAAA]. Process 1 Scanning Wmsys. WM$ENV_VARS[AAABEWAABAAAFMZAAA] ......... ........ Process 2 scanning SYS. UGROUP$[AAAAA5AABAAAAGPAAA]. Process 2 scanning SYS. CON$[AAAAACAABAAAACPAAA]. Process 1 Scanning SYS. File$[aaaaaraabaaaabxaaa]creating Database Scan Summary ... Creating individual Exception ... Scanner terminated successfully.

We can then check the output log to see the database scan:

Database Scan individual Exception report[database Scan parameters]parameter Value----------------------------------- -------------------------------------------Scan Type full database Scan CHAR data? YES Current database Character set ZHS16GBK New database character Set us7ascii Scan NCHAR data? NO Array Fetch buffer size 1000000 number of processes 2 Capture convertible data? YES------------------------------------------------------------------------------[Data Dictionary Individual Exceptions][application Data Individual exceptions]user:eygletable:testcolumn:nametype:varchar2 (a) Number of Except Ions:1 Max Post Conversion data size:4 ROWID Exception Type Size Cell data (A/bytes)--------------------------- --------------------------------------------aaabpiaadaaaaamaaa lossy conversion test------------------------------- ----- ----- ------------------------------



Data that cannot be converted will be recorded and we can update the data after the conversion to ensure that the conversion is correct.




Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.