Reading garbled Chinese characters from an Oracle US7ASCII database with JDBC, and how many bytes a Chinese character occupies in different encodings

Source: Internet
Author: User

Database Version: Oracle 10g

Character Set: SIMPLIFIED CHINESE_CHINA.US7ASCII

JDK: 1.6.0_45

Oracle driver: ojdbc14.jar

Connecting to the database through JDBC and executing SQL statements works without problems. However, all Chinese characters in the query results come back garbled.

Debugging shows that the data is already garbled at the moment it is fetched from the database through JDBC, whereas the same data viewed with PL/SQL and other tools apparently does not show this problem.

Could the Oracle driver be falling back to a default character set when it processes Chinese characters? As a last-ditch attempt, I tried forcibly transcoding the Chinese strings. ASCII is a subset of ISO-8859-1, so perhaps going through ISO-8859-1 would recover the correct characters?

The first test used new String(fieldValue.getBytes("ISO-8859-1")), and the output was still garbled. Transcoding means more than extracting the bytes in one encoding; they also have to be decoded in the right one. So the call was adjusted to new String(fieldValue.getBytes("ISO-8859-1"), "GBK"), and this time the Chinese characters came out correctly. The drawback is that this conversion has to be applied everywhere Chinese text is read, which is a little tedious.
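The transcoding step described above can be sketched without a database. This is only an illustration: the class and method names (GarbledTextFix, fixGarbled) are made up here, and the "garbled" input is simulated under the assumption that the driver hands back one char per raw GBK byte, which is what the successful ISO-8859-1 round trip suggests.

```java
import java.io.UnsupportedEncodingException;

class GarbledTextFix {
    /**
     * Re-interprets a string whose chars each hold one raw byte:
     * ISO-8859-1 turns every char back into the original byte,
     * and the bytes are then decoded as GBK.
     */
    static String fixGarbled(String fieldValue) throws UnsupportedEncodingException {
        if (fieldValue == null) {
            return null;
        }
        return new String(fieldValue.getBytes("ISO-8859-1"), "GBK");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String original = "\u4e2d\u6587"; // the two characters "中文"

        // Assumption: simulate what the driver returns by mapping
        // each GBK byte to a single char.
        String garbled = new String(original.getBytes("GBK"), "ISO-8859-1");

        System.out.println(garbled.length());                     // prints 4
        System.out.println(fixGarbled(garbled).equals(original)); // prints true
    }
}
```

In real code the same helper would be applied to each value returned by ResultSet.getString for columns that hold Chinese text.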

Reportedly there is another approach: change Oracle's character set setting (NLS_LANG). There are two ways to do this:

1. Modify the registry: set HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\HOME0\NLS_LANG to SIMPLIFIED CHINESE_CHINA.ZHS16GBK.

2. Modify the system environment variable NLS_LANG.

Of these two methods, the second has been tried and works. The first was not tested in the current environment.
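To confirm which value a Java process actually sees for the environment variable, a small check along these lines may help. It is a sketch only: the class and method names are hypothetical, and it relies on the LANGUAGE_TERRITORY.CHARSET layout of NLS_LANG values such as the one shown above.

```java
class NlsLangCheck {
    /** Returns the character-set part of an NLS_LANG value, or null if it has none. */
    static String charsetOf(String nlsLang) {
        if (nlsLang == null) {
            return null;
        }
        int dot = nlsLang.lastIndexOf('.');
        if (dot < 0 || dot == nlsLang.length() - 1) {
            return null; // no ".CHARSET" suffix present
        }
        return nlsLang.substring(dot + 1).trim();
    }

    public static void main(String[] args) {
        // Reads the variable from the environment of the current process.
        String nlsLang = System.getenv("NLS_LANG");
        System.out.println("NLS_LANG = " + nlsLang);
        System.out.println("charset  = " + charsetOf(nlsLang));
    }
}
```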

In addition, when the recovered Chinese text was written into another Oracle 11g database, each Chinese character turned out to be stored in three bytes rather than two.

A check showed that the target database uses the AL32UTF8 character set. In this UTF-8 based character set a common Chinese character generally takes three bytes, whereas in GBK it takes two. The field lengths of the target tables therefore need to be enlarged. For Chinese text, Oracle and the newer versions of SQL Server both support nvarchar-style columns. In such a column, every character counts as one unit of the declared length, whether it is a Chinese character, a digit, a symbol, or an English letter.

If you are not sure how many bytes a Chinese character occupies in the current database, you can check with select lengthb('汉') from dual; (lengthb returns the length in bytes, while length returns the length in characters).

New systems rarely use the US7ASCII character set today; this problem mostly shows up in older legacy systems. Pay attention to the details!
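The byte counts discussed above can also be checked on the Java side. The sketch below (class and method names are illustrative) encodes a single Chinese character in GBK and in UTF-8 and prints how many bytes each encoding needs.

```java
import java.io.UnsupportedEncodingException;

class ChineseByteLength {
    /** Number of bytes the text occupies when encoded with the given charset. */
    static int byteLength(String text, String charsetName) throws UnsupportedEncodingException {
        return text.getBytes(charsetName).length;
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String han = "\u6c49"; // the single character "汉"

        System.out.println(han.length());             // prints 1 (one character)
        System.out.println(byteLength(han, "GBK"));   // prints 2
        System.out.println(byteLength(han, "UTF-8")); // prints 3
    }
}
```

A column sized in bytes for GBK data therefore needs roughly one and a half times the space to hold the same Chinese text in a UTF-8 database.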
