Some tips on character encoding in Java that may help beginners

Source: Internet
Author: User
This is an exercise from teacher Zhang Xiaoxiang's Java Employment Training Video Tutorial (modified):
Write the following program, run it, and analyze the result:

import java.io.*;

public class TestCodeIO {
    public static void main(String[] args) throws Exception {
        // Decode the bytes arriving on standard input as ISO-8859-1.
        InputStreamReader isr = new InputStreamReader(System.in, "iso8859-1");
        BufferedReader br = new BufferedReader(isr);
        String strLine = br.readLine();
        br.close();
        isr.close();
        System.out.println(strLine);
    }
}
After running the program, enter the two Chinese characters for "China" (中国). The output is garbled: ???Ú
Modify the program in each of the following two ways so that Chinese input is printed correctly:
1. Modify this statement in the program:
InputStreamReader isr = new InputStreamReader(System.in, "iso8859-1");
2. Leave the statement above unchanged and modify this statement instead:
System.out.println(strLine);


The first method is simple: change the statement as follows, so that the reader decodes the input with the same charset the console uses (gb2312 here). It is not discussed further.
InputStreamReader isr = new InputStreamReader(System.in, "gb2312");
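The effect of the first method can be reproduced without a console. The following is only a sketch under stated assumptions: a `ByteArrayInputStream` stands in for `System.in`, the class and helper names are hypothetical, and GB2312 is used only if this JDK provides it (otherwise UTF-8 serves as a stand-in for the console charset).

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeWithMatchingCharset {
    // Assumption: stands in for the console charset (GB2312 in the article).
    static Charset consoleCharset() {
        return Charset.isSupported("GB2312") ? Charset.forName("GB2312") : StandardCharsets.UTF_8;
    }

    // Reads one line from the stream, decoding its bytes with the given charset.
    static String readLine(InputStream in, Charset cs) {
        try (BufferedReader br = new BufferedReader(new InputStreamReader(in, cs))) {
            return br.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Charset console = consoleCharset();
        String typed = "\u4e2d\u56fd"; // the two characters for "China"
        // Simulates the bytes that System.in would deliver for this input.
        byte[] keyboardBytes = (typed + "\n").getBytes(console);

        String good = readLine(new ByteArrayInputStream(keyboardBytes), console);
        String bad = readLine(new ByteArrayInputStream(keyboardBytes), StandardCharsets.ISO_8859_1);

        System.out.println(good.equals(typed)); // true: decoder matches the encoder
        System.out.println(bad.equals(typed));  // false: each byte became its own character
    }
}
```

Decoding with the charset that produced the bytes yields the typed characters directly, which is why the first method needs no further conversion.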


Here I want to discuss the second method in detail.

This is how I changed it at first:
System.out.println(new String(strLine.getBytes(), "iso8859-1"));
After entering "China", the output was different from the garbled text described above, but it was still garbled, so this change is clearly wrong.

I would like to thank the software worker who told me the correct change, which made everything suddenly clear:
System.out.println(new String(strLine.getBytes("iso8859-1")));

What is the difference between these two changes? For easier reading, here are the correct and the wrong change together:
import java.io.*;

public class TestCodeIO {
    public static void main(String[] args) throws Exception {
        InputStreamReader isr = new InputStreamReader(System.in, "iso8859-1");
        // Create an InputStreamReader that uses the given charset decoder.
        BufferedReader br = new BufferedReader(isr);
        String strLine = br.readLine();
        br.close();
        isr.close();
        System.out.println(strLine);

        System.out.println(new String(strLine.getBytes(), "iso8859-1")); // wrong method
        // Encodes this String (strLine) into a sequence of bytes using the platform's
        // default charset (gb2312), then constructs a new String by decoding that
        // byte array using the specified charset (iso8859-1).
        // strLine was produced by the "iso8859-1" decoder, so only encoding it with
        // "iso8859-1" gives back the original input bytes. Encoding it with the
        // platform's default charset "gb2312" is wrong.

        System.out.println(new String(strLine.getBytes("iso8859-1"))); // correct method
        // Encodes this String (strLine) into a sequence of bytes using the named
        // charset (iso8859-1), then constructs a new String by decoding that
        // byte array using the platform's default charset (gb2312).
        // This is right.
    }
}

The English comments above already make this clear, but I will explain it again here.

First, the wrong method: System.out.println(new String(strLine.getBytes(), "iso8859-1"));
This code encodes the string in strLine into a byte sequence using the default charset (gb2312 here), then constructs a new String object from those bytes using the specified charset (iso8859-1 here) and prints it to the screen.
Where is the mistake?
Look at this piece of code:
InputStreamReader isr = new InputStreamReader(System.in, "iso8859-1");
BufferedReader br = new BufferedReader(isr);
String strLine = br.readLine();
The content of strLine was decoded from the input bytes with the specified charset (ISO8859-1), but strLine.getBytes() encodes it back to bytes with the system default charset, gb2312. The resulting bytes are not the bytes that were originally typed, so of course the output is garbled. The new String object is then constructed from this gb2312-encoded byte sequence using the ISO8859-1 charset, which is why the garbled output differs from that of System.out.println(strLine).
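The byte-level effect of the wrong change can be traced with a small sketch. It is illustrative only: the charsets are passed explicitly instead of relying on the platform default, the class and helper names are hypothetical, and GB2312 is used only if this JDK provides it (UTF-8 is the stand-in otherwise).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class WrongConversionTrace {
    // Assumption: stands in for the platform default charset described in the text.
    static Charset platformCharset() {
        return Charset.isSupported("GB2312") ? Charset.forName("GB2312") : StandardCharsets.UTF_8;
    }

    // Formats bytes as space-separated hex so each step can be compared by eye.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    // The wrong change: encode with the platform charset, decode as ISO-8859-1.
    static String wrongFix(String strLine, Charset platform) {
        return new String(strLine.getBytes(platform), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        Charset platform = platformCharset();
        byte[] typed = "\u4e2d\u56fd".getBytes(platform);                // bytes from the keyboard
        String strLine = new String(typed, StandardCharsets.ISO_8859_1); // what readLine() returned
        byte[] reEncoded = strLine.getBytes(platform);                   // what getBytes() produces

        System.out.println("typed bytes:      " + hex(typed));
        System.out.println("re-encoded bytes: " + hex(reEncoded));
        System.out.println(Arrays.equals(typed, reEncoded));             // false: input bytes are lost
        System.out.println(wrongFix(strLine, platform).equals("\u4e2d\u56fd")); // false
    }
}
```

Comparing the two hex lines shows that re-encoding with the platform charset does not reproduce the typed bytes, so no later decoding step can bring the original characters back.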


As for the correct method, there is little to add: strLine is first encoded to a byte sequence with iso8859-1, which restores the original input bytes. A new String object is then constructed from those bytes with the system default charset (gb2312) and printed.
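The correct change works because ISO-8859-1 maps every byte value to exactly one character and back. A minimal sketch of that round trip follows, again with explicit charsets, hypothetical class and helper names, and GB2312 used only if this JDK provides it (UTF-8 otherwise).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Latin1RoundTrip {
    // Assumption: stands in for the platform default charset described in the text.
    static Charset platformCharset() {
        return Charset.isSupported("GB2312") ? Charset.forName("GB2312") : StandardCharsets.UTF_8;
    }

    // The correct change with both charsets spelled out:
    // ISO-8859-1 restores the input bytes, the platform charset decodes them.
    static String recover(String strLine, Charset platform) {
        byte[] originalBytes = strLine.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, platform);
    }

    // Checks that all 256 byte values survive decoding and re-encoding as ISO-8859-1.
    static boolean latin1IsLossless() {
        byte[] all = new byte[256];
        for (int i = 0; i < 256; i++) {
            all[i] = (byte) i;
        }
        String decoded = new String(all, StandardCharsets.ISO_8859_1);
        return Arrays.equals(all, decoded.getBytes(StandardCharsets.ISO_8859_1));
    }

    public static void main(String[] args) {
        Charset platform = platformCharset();
        String typed = "\u4e2d\u56fd"; // the two characters for "China"
        // Simulates readLine() on a reader that decodes as ISO-8859-1.
        String strLine = new String(typed.getBytes(platform), StandardCharsets.ISO_8859_1);

        System.out.println(latin1IsLossless());                      // true
        System.out.println(recover(strLine, platform).equals(typed)); // true
    }
}
```

Passing the charset explicitly, as `recover` does, avoids depending on whatever the platform default happens to be on the machine running the program.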
