Merlin's Magic: Character set

Last Update:2017-02-27 Source: Internet

Author: User


Representing text with numbers

It goes without saying that computers understand only numbers. What follows from that may be less obvious: because computers understand only numbers, they need some way to map numeric values to characters so that text can be displayed. That mapping is the character set, and it is what lets the computer make sense of text. Early desktop computers, for example, used ASCII for this mapping. When a computer that uses ASCII stores the numbers 72, 101, 108, and 112, it knows to display the word "Help", because in ASCII 72 is the value of H, 101 of e, 108 of l, and 112 of p. On an early IBM mainframe, which uses EBCDIC instead of ASCII, the same word "Help" is represented by the numbers 200, 133, 147, and 151.
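The mapping described above can be checked with a few lines of Java. This is only a minimal sketch: the class name MappingDemo and the helper codes() are made up for illustration, and IBM037 is assumed here as a representative EBCDIC code page. Because IBM037 is not one of the charsets every runtime must provide, the sketch checks for it before using it.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical demo class; not part of the original article.
class MappingDemo {

    // Returns the numeric value that the named mapping assigns to each character.
    static int[] codes(String text, String charsetName) {
        byte[] bytes = text.getBytes(Charset.forName(charsetName));
        int[] values = new int[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            values[i] = bytes[i] & 0xFF; // Java bytes are signed; mask to get 0..255
        }
        return values;
    }

    public static void main(String[] args) {
        // prints [72, 101, 108, 112]
        System.out.println(Arrays.toString(codes("Help", "US-ASCII")));

        // IBM037 (an EBCDIC code page) is optional, so test for it first.
        if (Charset.isSupported("IBM037")) {
            System.out.println(Arrays.toString(codes("Help", "IBM037")));
        }
    }
}
```

The same four characters come out as different numbers depending only on which mapping is applied.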

Basic knowledge of character sets

Turning to the Java language, three classes in the java.nio.charset package help with this mapping: Charset, CharsetEncoder, and CharsetDecoder. These classes work together so that you can take text in one mapping and convert it to another. To convert from another mapping to Java's internal mapping (Unicode), you use a decoder. To convert from Unicode to another mapping (or back to the original one), you use an encoder. The java.nio.charset package cannot convert directly between two non-Unicode formats, but you can convert between them by going through Unicode as an intermediate format.

Before you can get a decoder or encoder, you need the Charset for a particular mapping. For example, US-ASCII is the name of the mapping for the 7-bit ASCII character set. You simply pass the name to the Charset.forName() method as follows:

Charset charset = Charset.forName("US-ASCII");

Once you have the Charset, simply request a CharsetDecoder and a CharsetEncoder as follows:

CharsetDecoder decoder = charset.newDecoder();
CharsetEncoder encoder = charset.newEncoder();

With a decoder and an encoder, you can convert between external bytes and Java characters, as follows:

ByteBuffer bytes = ...;
CharBuffer chars = decoder.decode(bytes);
bytes = encoder.encode(chars);
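To make the decode and encode steps concrete, here is a minimal sketch that moves text from one byte mapping to another by passing through Unicode characters in between. The class TranscodeDemo and the method transcode() are hypothetical names chosen for this example. ISO-8859-1 and UTF-8 are used because every Java runtime supports them.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;

// Hypothetical demo class; not part of the original article.
class TranscodeDemo {

    // Decodes the input bytes to Unicode characters, then encodes them in the target charset.
    static byte[] transcode(byte[] input, String fromName, String toName)
            throws CharacterCodingException {
        CharBuffer chars = Charset.forName(fromName).newDecoder().decode(ByteBuffer.wrap(input));
        ByteBuffer out = Charset.forName(toName).newEncoder().encode(chars);
        byte[] result = new byte[out.remaining()];
        out.get(result);
        return result;
    }

    public static void main(String[] args) throws CharacterCodingException {
        // "café" in ISO-8859-1: the letter é is the single byte 0xE9
        byte[] latin1 = {0x63, 0x61, 0x66, (byte) 0xE9};
        byte[] utf8 = transcode(latin1, "ISO-8859-1", "UTF-8");
        System.out.println(utf8.length); // prints 5, because é needs two bytes in UTF-8
    }
}
```

Both decode() and encode() on the decoder and encoder objects throw CharacterCodingException when they meet malformed or unmappable input, so callers have to handle or declare it.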

Of course, if you are unsure which character sets are available, you can ask with the following statement:

SortedMap<String, Charset> map = Charset.availableCharsets();

You use a specific decoder to convert external bytes to internal characters. Then, if you need to send the data back out of the Java code, you use the encoder to convert the internal characters to external bytes. Exactly which character sets are available depends on your runtime. However, every implementation of the Java platform must support the following encodings:
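As a small illustration of that query, the sketch below prints each available charset with its aliases. The class name ListCharsets and the helper available() are invented for this example, and the output differs from one runtime to another.

```java
import java.nio.charset.Charset;
import java.util.Map;
import java.util.SortedMap;

// Hypothetical demo class; not part of the original article.
class ListCharsets {

    // The map is keyed by canonical charset name.
    static SortedMap<String, Charset> available() {
        return Charset.availableCharsets();
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Charset> entry : available().entrySet()) {
            // aliases() lists the alternative names accepted by Charset.forName()
            System.out.println(entry.getKey() + " " + entry.getValue().aliases());
        }
    }
}
```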

US-ASCII: 7-bit ASCII

ISO-8859-1: ISO Latin Alphabet No. 1

UTF-8: 8-bit UCS Transformation Format

UTF-16BE: 16-bit UCS Transformation Format, big-endian byte order

UTF-16LE: 16-bit UCS Transformation Format, little-endian byte order

UTF-16: 16-bit UCS Transformation Format, byte order identified by an optional byte-order mark
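The difference between the three UTF-16 variants is easiest to see in the bytes they produce. The following sketch encodes the single character "A" with each of them. The class ByteOrderDemo and the helper hex() are hypothetical names. The StandardCharsets constants used here were added in Java 7, later than the java.nio.charset package itself.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original article.
class ByteOrderDemo {

    // Formats bytes as space-separated two-digit hex values.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(hex("A".getBytes(StandardCharsets.UTF_16BE))); // prints 00 41
        System.out.println(hex("A".getBytes(StandardCharsets.UTF_16LE))); // prints 41 00
        System.out.println(hex("A".getBytes(StandardCharsets.UTF_16)));   // prints FE FF 00 41
    }
}
```

When encoding with plain UTF-16, Java writes a big-endian byte-order mark (FE FF) first and then uses big-endian order. When decoding, it reads the mark to decide the order.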

In addition, individual platforms may support character sets specific to that platform (for example, on Windows you will find the windows-1252 character set). If you need to support other character sets, you can create your own. See the CharsetProvider API in the java.nio.charset.spi package.
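As a rough idea of what a provider involves, here is a minimal sketch. Everything in it is hypothetical: the charset name X-DEMO-ASCII is made up, and instead of implementing a real decoder and encoder the sketch simply borrows those of US-ASCII. A genuine custom character set would supply its own CharsetDecoder and CharsetEncoder subclasses.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;
import java.nio.charset.spi.CharsetProvider;
import java.util.Collections;
import java.util.Iterator;

// Hypothetical provider that exposes one made-up charset.
class DemoCharsetProvider extends CharsetProvider {

    // A charset with a new name that delegates all real work to US-ASCII.
    static final class DemoCharset extends Charset {
        DemoCharset() {
            super("X-DEMO-ASCII", null); // canonical name, no aliases
        }

        @Override
        public boolean contains(Charset cs) {
            return cs instanceof DemoCharset || StandardCharsets.US_ASCII.contains(cs);
        }

        @Override
        public CharsetDecoder newDecoder() {
            return StandardCharsets.US_ASCII.newDecoder();
        }

        @Override
        public CharsetEncoder newEncoder() {
            return StandardCharsets.US_ASCII.newEncoder();
        }
    }

    private final Charset demo = new DemoCharset();

    // Returns the charset for a matching name, or null if this provider does not know it.
    @Override
    public Charset charsetForName(String name) {
        return demo.name().equalsIgnoreCase(name) ? demo : null;
    }

    @Override
    public Iterator<Charset> charsets() {
        return Collections.singletonList(demo).iterator();
    }

    public static void main(String[] args) {
        // Calls the provider directly; no registration is involved here.
        Charset cs = new DemoCharsetProvider().charsetForName("X-DEMO-ASCII");
        System.out.println(cs.decode(ByteBuffer.wrap(new byte[] {72, 101, 108, 112}))); // prints Help
    }
}
```

For Charset.forName() to find such a charset, the provider class would also have to be public and be listed in a META-INF/services/java.nio.charset.spi.CharsetProvider file on the class path. The sketch skips that step and calls the provider directly.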

This article is an English version of an article originally written in Chinese on aliyun.com and is provided for information purposes only.
