Garbled Chinese characters when reading an Oracle US7ASCII database over JDBC, and how Chinese characters are stored under different encodings
Database Version: Oracle 10g
Character Set: SIMPLIFIED CHINESE_CHINA.US7ASCII
JDK: 1.6.0_45
Oracle driver: ojdbc14.jar
Using JDBC to connect to the database and execute SQL statements works without problems. However, all Chinese characters in the query results are garbled.
Debugging shows that the data is already garbled at the moment it is read from the database, whereas PL/SQL and other tools display the same data correctly.
Could the Oracle driver be using a default character set when it handles Chinese characters? With nothing to lose, I tried transcoding the strings by force. ASCII is a subset of ISO-8859-1, so perhaps going through ISO-8859-1 would recover the Chinese text.
The first test used new String(fieldValue.getBytes("ISO-8859-1")); the output was still garbled! Transcoding needs a target encoding as well, not just a source encoding, so I changed it to new String(fieldValue.getBytes("ISO-8859-1"), "GBK") and tested again: it worked! The drawback is that the conversion has to be applied everywhere Chinese text is read, which is a little troublesome.
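To avoid repeating the conversion at every call site, it can be wrapped in one small helper. The following is only a minimal sketch: the class name Us7AsciiText and the method name toChinese are hypothetical, and it assumes, as observed above, that the driver hands back one char per stored byte and that the stored bytes are GBK.

```java
import java.nio.charset.Charset;

// Hypothetical helper; the class and method names are illustrative only.
final class Us7AsciiText {
    private static final Charset LATIN1 = Charset.forName("ISO-8859-1");
    private static final Charset GBK = Charset.forName("GBK");

    private Us7AsciiText() {}

    /** Re-decodes a string whose GBK bytes were read as one char per byte. */
    static String toChinese(String fieldValue) {
        if (fieldValue == null) {
            return null; // NULL columns stay null
        }
        // ISO-8859-1 maps chars 0..255 back to the same byte values,
        // which recovers the raw bytes; GBK then decodes them as Chinese text.
        return new String(fieldValue.getBytes(LATIN1), GBK);
    }

    public static void main(String[] args) {
        // Simulate the garbled value: GBK bytes decoded as ISO-8859-1.
        String original = "\u4e2d\u6587"; // two Chinese characters
        String garbled = new String(original.getBytes(GBK), LATIN1);
        System.out.println(original.equals(toChinese(garbled)));
    }
}
```

A call site would then read something like Us7AsciiText.toChinese(rs.getString("NAME")), where the column name is again only an example.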
Reportedly there is another approach: change the character set setting that Oracle uses!
1. Modify the registry: set HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\HOME0\NLS_LANG to SIMPLIFIED CHINESE_CHINA.ZHS16GBK.
2. Modify the NLS_LANG system environment variable.
Of these two methods, the second has been tried and works. The first was not tested in the current environment.
In addition, when the Chinese text obtained this way was written to another database, an Oracle 11g instance, each Chinese character turned out to occupy three bytes instead of two!
A check showed that the target database's character set is AL32UTF8, in which a Chinese character generally takes three bytes. The column lengths in the target tables therefore need to be enlarged. For Chinese text, both Oracle and newer versions of SQL Server support the nvarchar type. In an nvarchar column the length is counted in characters: a Chinese character, a digit, a symbol, or an English letter each counts as one character.
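The size difference can be reproduced on the Java side without a database. This is a small sketch under the assumption that Oracle's ZHS16GBK corresponds to Java's GBK charset and AL32UTF8 to UTF-8; the class name ColumnSizing and the method byteLength are hypothetical.

```java
import java.nio.charset.Charset;

// Hypothetical helper for sizing byte-length columns; names are illustrative.
final class ColumnSizing {
    private ColumnSizing() {}

    /** Number of bytes the text occupies in the named character set. */
    static int byteLength(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName)).length;
    }

    public static void main(String[] args) {
        String han = "\u6c49"; // a single Chinese character
        // Assumption: GBK stands in for ZHS16GBK, UTF-8 for AL32UTF8.
        System.out.println("GBK bytes:   " + byteLength(han, "GBK"));
        System.out.println("UTF-8 bytes: " + byteLength(han, "UTF-8"));
    }
}
```

For this character the GBK encoding is two bytes and the UTF-8 encoding is three, so a byte-sized column that held N Chinese characters in the source database needs roughly 1.5 times the length in the target.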
If you are not sure how many bytes a Chinese character occupies in the current database, you can check with select lengthb('汉') from dual; (length returns the count in characters, while lengthb returns the count in bytes).
Currently, the US7ASCII character set is rarely used by new systems. This may happen to some older legacy systems. Pay attention to the details!