Exploring garbled text in Java

Source: Internet
Author: User
Tags: control characters

Many Java beginners run into garbled text (mojibake), and plenty of people are frustrated by it. The following is a summary of the problem.

Why encode?

First, a computer only understands binary, but we normally work with characters (English, Chinese, and others). So a conversion between characters and bytes is always involved. That conversion is encoding (characters to bytes) and decoding (bytes to characters).
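
As a minimal sketch of that conversion, the class below encodes a string to bytes and decodes it back. The class and method names are illustrative only, and UTF-8 is an arbitrary choice of encoding for the demonstration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodeDecodeDemo {
    // Encoding: characters -> bytes.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decoding: bytes -> characters.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 'A' followed by the Chinese character U+4E2D, written as an
        // escape so the source file's own encoding does not matter.
        byte[] bytes = encode("A\u4e2d");
        System.out.println(Arrays.toString(bytes)); // [65, -28, -72, -83]
        System.out.println(decode(bytes).equals("A\u4e2d")); // true
    }
}
```

The ASCII letter takes one byte and the Chinese character takes three, which is why the byte array is longer than the string.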

What encodings are there? Anyone who has studied computing will have come across the following.

ASCII encoding: One of the most widely used character sets and encodings in computing. It uses 7-bit or 8-bit binary numbers to represent 128 or 256 possible characters. Standard ASCII, also called basic ASCII, uses 7 bits to represent the uppercase and lowercase letters, the digits 0 through 9, punctuation, and the special control characters used in American English.

ISO-8859-1: Covers Western European languages. It looks very limited, so why is it still used? Because it is a single-byte encoding, which matches the byte, the computer's most basic unit of representation, so it is still used in many situations, and many protocols use it by default. The problem is that it does not support Chinese, which results in garbled characters.

GB2312: A double-byte encoding. The first byte ranges over A1~F7, where A1~A9 is the symbol area, with 682 symbols, and B0~F7 is the Chinese character area, with 6,763 characters.

GBK: An extension of the GB2312-80 standard that uses a double-byte encoding scheme. Its code range is 8140 to FEFE (excluding xx7F), giving 23,940 code points, of which 21,003 are Chinese characters. It is fully compatible with GB2312-80, supports all CJK Chinese characters in the international standard ISO/IEC 10646-1 and the national standard GB 13000-1, and contains all the Chinese characters in the Big5 encoding.

UTF-16: UTF-16 is one of the encoding forms of Unicode, and Unicode is the basis for Java and XML. UTF-16 defines how Unicode code points are converted to 16-bit code units, hence the name. Most commonly used characters, including nearly all everyday Chinese characters, take a single code unit (two bytes), which makes strings simple to handle and is an important reason Java uses UTF-16 as its in-memory character format. It is not strictly fixed-length, though: characters outside the Basic Multilingual Plane take two code units (a surrogate pair, four bytes).

UTF-8: UTF-16 uses at least two bytes per character, which is simple and convenient but has a drawback. A large share of characters could be represented in a single byte, so using two doubles both the storage and the network bandwidth. Hence UTF-8, a variable-length encoding in which different ranges of code points get different byte lengths. A character takes 1 to 4 bytes (the original design allowed up to 6, but the current standard limits it to 4).
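
To make the size difference concrete, the sketch below counts the bytes that UTF-8 and UTF-16 need for a few sample characters. The class and method names are hypothetical, and UTF-16BE is used so that no byte-order mark is added to the count. Note that a character outside the Basic Multilingual Plane (the last sample) takes four bytes in both encodings.

```java
import java.nio.charset.StandardCharsets;

public class ByteLengthDemo {
    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the text occupies when encoded as UTF-16
    // (big-endian variant, which writes no byte-order mark).
    static int utf16Length(String text) {
        return text.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String[] samples = {
            "A",            // ASCII letter:             UTF-8 1 byte,  UTF-16 2 bytes
            "\u00e9",       // Latin letter with accent: UTF-8 2 bytes, UTF-16 2 bytes
            "\u4e2d",       // Chinese character:        UTF-8 3 bytes, UTF-16 2 bytes
            "\ud83d\ude00"  // emoji (surrogate pair):   UTF-8 4 bytes, UTF-16 4 bytes
        };
        for (String s : samples) {
            System.out.println(utf8Length(s) + " / " + utf16Length(s));
        }
    }
}
```

For mostly ASCII text UTF-8 is the smaller of the two, while for mostly Chinese text UTF-16 is smaller, so neither is always the more compact choice.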

There are of course other encodings, which we won't go through one by one.

Of the encodings above, four can handle Chinese: GB2312, GBK, UTF-16, and UTF-8. Of these, GBK and UTF-8 are the recommended choices; with other encodings you are likely to run into further problems.

What is the cause of garbled characters?

Garbled text appears when the encoding used to write the characters differs from the encoding used to read or display them, for example encoding with GBK and then decoding with UTF-8. That is why we usually insist on one consistent encoding across the whole development environment: page encoding, database encoding, and so on. Some Java video tutorials never set the database encoding, which leads to problems, so take care to set the encoding in the database.
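
The GBK and UTF-8 mismatch can be reproduced in a few lines. This is a sketch with hypothetical names; GBK ships with standard full JDK installations but is not guaranteed on every runtime, so the example checks for it before using it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encode the text with one charset, then decode the bytes with another.
    static String encodeThenDecode(String text, Charset writer, Charset reader) {
        byte[] bytes = text.getBytes(writer);
        return new String(bytes, reader);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // two Chinese characters
        if (!Charset.isSupported("GBK")) {
            System.out.println("GBK is not available on this runtime");
            return;
        }
        Charset gbk = Charset.forName("GBK");
        // Same charset on both sides: the text survives.
        System.out.println(original.equals(
                encodeThenDecode(original, gbk, gbk))); // true
        // GBK bytes decoded as UTF-8: the text is garbled.
        System.out.println(original.equals(
                encodeThenDecode(original, gbk, StandardCharsets.UTF_8))); // false
    }
}
```

In the mismatched case the GBK bytes are not valid UTF-8, so the decoder substitutes replacement characters and the original text cannot be recovered from the result.
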
Java stores characters as Unicode at run time, which lets it support multilingual environments. File reading is common during development. When Java reads a file as text without being told otherwise, it decodes it with the default charset (the platform's encoding on older JDKs, UTF-8 since JDK 18). To specify the encoding yourself, read the file through a FileInputStream wrapped in an InputStreamReader that is given the charset.
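
One way to read a file with an explicit encoding is to wrap a FileInputStream in an InputStreamReader that is given the charset. The sketch below assumes a small temporary file created only for the demonstration; the class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadWithCharsetDemo {
    // Read the whole file, decoding its bytes with the given charset
    // instead of relying on the JVM default.
    static String readAll(Path file, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file.toFile()), charset))) {
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        String original = "\u4e2d\u6587";
        Path tmp = Files.createTempFile("encoding-demo", ".txt");
        try {
            Files.write(tmp, original.getBytes(StandardCharsets.UTF_8));
            // Decoding with the charset the file was written in works.
            System.out.println(readAll(tmp, StandardCharsets.UTF_8).equals(original)); // true
            // Decoding with a different charset produces garbled text.
            System.out.println(readAll(tmp, StandardCharsets.ISO_8859_1).equals(original)); // false
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```
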
Garbled JSP pages are usually fixed by a directive at the top of the page that specifies the character set encoding. If that does not help, convert the string explicitly, where the second argument stands for the page's actual encoding name: str = new String(str.getBytes("ISO-8859-1"), "page encoding");
Web containers have traditionally used ISO-8859-1 as the default encoding for network transmission, so output needs to be converted, for example:
str = new String(str.getBytes("development environment encoding"), "ISO-8859-1");
Chinese text that has been through the network encoding must then be decoded in a similar way to display correctly on the page:
str = new String(str.getBytes("ISO-8859-1"), "development environment encoding");
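
The ISO-8859-1 conversions above can be sketched as a round trip. This is an illustration under the assumption that the development environment encoding is UTF-8; the class and method names are hypothetical. It works because Java's ISO-8859-1 maps every byte value to exactly one character, so no bytes are lost on the way through.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Latin1RecoveryDemo {
    // Simulates text whose bytes (in the real encoding) were decoded
    // as ISO-8859-1 somewhere along the way, producing garbled text.
    static String misdecode(String original, Charset realEncoding) {
        return new String(original.getBytes(realEncoding), StandardCharsets.ISO_8859_1);
    }

    // Undoes the damage: get the raw bytes back via ISO-8859-1,
    // then decode them with the encoding they were really written in.
    static String recover(String garbled, Charset realEncoding) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), realEncoding);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587";
        String garbled = misdecode(original, StandardCharsets.UTF_8);
        System.out.println(garbled.equals(original)); // false
        System.out.println(recover(garbled, StandardCharsets.UTF_8).equals(original)); // true
    }
}
```

The recovery only works when the wrong decoding step was ISO-8859-1. If the bytes were misread with a charset that rejects or replaces invalid sequences, such as UTF-8, the original bytes are already gone.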

Finally, to avoid unnecessary garbled Chinese in Java, we must start at the source.
What does that mean? In project development, first unify the encoding of the development environment. For example, my environment is Eclipse + Tomcat 6.0 + MySQL. The project properties in Eclipse let you set the encoding for the whole project, and the encoding of the JSP pages can be set uniformly as well.

I hope that after reading this article you will no longer be afraid of garbled text. The problem is not frightening; once you understand the principles behind it, it can be solved.

Article reposted from the Internet.
