Explanation of Java character encoding problems (storing files in ANSI format)

Source: Internet
Author: User
Understanding encoding problems:
Basic concepts:
1. ANSI encodings; on Chinese Windows these include GBK and GB2312.
2. UTF-8, an encoding of the ISO 10646-1 (Unicode) character set.
3. In its default configuration, the IE browser displays Chinese correctly only when the page is stored in ANSI; otherwise the text is garbled. ANSI (GB2312) is the fastest way to parse Web pages that are mostly Chinese.
4. FileWriter writes in the platform default encoding (given here as UTF-8 on WinXP and Win7).
5. Writing through new OutputStreamWriter(new FileOutputStream(f), "GB2312") forces the output into ANSI encoding.
The following is the solution code for garbled Chinese in HTML files.
-----------------------------------------------------------------------------------------------
Explanation:
The main difference between byte streams and character streams is the way they handle data.
Byte streams are the most basic: all InputStream and OutputStream subclasses work in units of bytes and are mainly used for processing binary data.
In practice, however, much data is text, so the concept of the character stream was also introduced. A character stream works according to the virtual machine's encoding, which means it performs character set conversion.
The two are connected through InputStreamReader and OutputStreamWriter, which in effect convert between byte[] and String.
The Chinese character problems met in actual development are in fact caused by conversions between character streams and byte streams.
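As a minimal sketch of that bridge, the example below pushes text through OutputStreamWriter and reads it back through InputStreamReader, once with matching charsets and once with mismatched ones. The class and method names (StreamBridgeDemo, encode, decode) are made up for illustration, in-memory streams stand in for files, and the code assumes the JDK in use ships the GBK charset.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class StreamBridgeDemo {
    // Characters -> bytes: OutputStreamWriter encodes with the given charset.
    static byte[] encode(String text, Charset cs) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(bytes, cs)) {
            w.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Bytes -> characters: InputStreamReader decodes with the given charset.
    static String decode(byte[] data, Charset cs) {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(data), cs)) {
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK"); // assumption: GBK is available in this JDK
        String text = "\u4e2d\u6587";         // two Chinese characters, written as escapes
        byte[] data = encode(text, gbk);
        // Same charset on both sides: the text survives the round trip.
        System.out.println(decode(data, gbk).equals(text));                    // true
        // Mismatched charset: the bytes are misread and the text is garbled.
        System.out.println(decode(data, StandardCharsets.UTF_8).equals(text)); // false
    }
}
```

The Unicode escapes keep the example independent of the encoding of the source file itself.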
Converting a byte stream to a character stream is in effect converting byte[] to a String:
public String(byte[] bytes, String charsetName)
The key parameter is the character set encoding. It is usually omitted, in which case the system uses the OS language setting (the platform default charset).
Converting a character stream to a byte stream is in effect converting a String to byte[]:
byte[] String.getBytes(String charsetName)
The same applies here.
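The two conversions can be sketched as follows. This is an illustration rather than the author's code: it uses the Charset overloads of the String constructor and getBytes instead of the charsetName ones, so no checked UnsupportedEncodingException is involved, and the class name StringBytesDemo and its helpers are hypothetical. It assumes the GBK charset is available in the JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class StringBytesDemo {
    // String -> byte[]: the charset decides which bytes are produced.
    static byte[] toBytes(String text, Charset cs) {
        return text.getBytes(cs);
    }

    // byte[] -> String: the charset must match the one used for encoding.
    static String toText(byte[] data, Charset cs) {
        return new String(data, cs);
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK"); // assumption: GBK is available in this JDK
        String text = "\u4e2d\u6587";         // two Chinese characters, written as escapes

        byte[] gbkBytes = toBytes(text, gbk);
        byte[] utf8Bytes = toBytes(text, StandardCharsets.UTF_8);
        System.out.println(gbkBytes.length);  // 4: two bytes per character in GBK
        System.out.println(utf8Bytes.length); // 6: three bytes per character in UTF-8

        System.out.println(toText(gbkBytes, gbk).equals(text));                    // true
        System.out.println(toText(gbkBytes, StandardCharsets.UTF_8).equals(text)); // false

        // The charset used when the parameter is omitted; the value depends on the machine.
        System.out.println(Charset.defaultCharset());
    }
}
```

Passing the charset explicitly on both sides removes the dependence on the machine's default.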
java.io also contains many other streams, designed mainly to improve performance and ease of use,
such as BufferedInputStream and PipedInputStream.
------------------------------------------------------------------------------------------------
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStreamWriter;

public class CharsetTest { // forces the file to be stored in ANSI (GB2312) encoding
    public static void main(String[] args) throws Exception {
        File f = new File("c:\\f1.html");
        // Sample content only; the original string literal is missing from the source.
        String str = "<html><body>\u4e2d\u6587</body></html>";
        // OutputStreamWriter is the bridge from character streams to byte streams.
        // OutputStream is the abstract superclass of all classes that represent an
        // output stream of bytes; it accepts output bytes and sends them to a sink.
        // FileOutputStream is the subclass whose sink is a file.
        OutputStreamWriter osw = new OutputStreamWriter(new FileOutputStream(f), "GB2312");
        osw.write(str);
        osw.flush();
        osw.close();
    }
}
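
One way to check that a file really was stored in GB2312 is to read its raw bytes back and compare them with what that encoding should produce. The sketch below is an assumption-laden illustration, not part of the original program: it writes to a temporary file rather than c:\f1.html, uses made-up sample HTML, and the names CharsetFileCheck and writeAndReadBack are hypothetical. It also assumes the GB2312 charset is available in the JDK.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

class CharsetFileCheck {
    // Writes text to a temporary file in the given charset and returns the bytes on disk.
    static byte[] writeAndReadBack(String text, Charset cs) {
        try {
            Path tmp = Files.createTempFile("charset-check", ".html"); // hypothetical file name
            try (Writer w = new OutputStreamWriter(new FileOutputStream(tmp.toFile()), cs)) {
                w.write(text);
            }
            byte[] onDisk = Files.readAllBytes(tmp);
            Files.delete(tmp);
            return onDisk;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Charset gb2312 = Charset.forName("GB2312"); // assumption: available in this JDK
        String html = "<html><body>\u4e2d\u6587</body></html>"; // sample content
        byte[] onDisk = writeAndReadBack(html, gb2312);
        // 26 ASCII characters at one byte each plus 2 Chinese characters at two bytes each.
        System.out.println(onDisk.length);                            // 30
        System.out.println(new String(onDisk, gb2312).equals(html)); // true
    }
}
```

The same text stored as UTF-8 would occupy 32 bytes, so the byte count alone already distinguishes the two encodings for this sample.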
