Merlin's Magic: Character set


Representing text with numbers

It may go without saying, but computers understand only numbers. What follows may be less obvious: because computers understand only numbers, they need some way to map numeric values to characters so that text can be displayed. That mapping is the character set, and it is what lets the computer interpret text. Early desktop computers, for example, used ASCII for this mapping. When a computer that uses ASCII stores the numbers 72, 101, 108, and 112, it knows to display the word "Help", because in ASCII 72 is the value of H, 101 is the value of e, 108 is the value of l, and 112 is the value of p. On an early IBM mainframe, which uses EBCDIC instead of ASCII, the same word is represented by the numbers 200, 133, 147, and 151.
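The mapping described above can be checked directly from Java. The following is a minimal sketch, not part of the original article: the class name HelpBytes and the helper codes() are made up for illustration, and the EBCDIC part assumes the runtime happens to ship the optional IBM037 charset, which is why it is guarded with Charset.isSupported().

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class HelpBytes {
    // Hypothetical helper: returns the unsigned value of each byte
    // that the given charset uses to represent the text.
    static int[] codes(String text, Charset charset) {
        byte[] raw = text.getBytes(charset);
        int[] values = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            values[i] = raw[i] & 0xFF; // Java bytes are signed; mask to get 0..255
        }
        return values;
    }

    public static void main(String[] args) {
        // US-ASCII is guaranteed to exist; prints [72, 101, 108, 112]
        System.out.println(Arrays.toString(codes("Help", StandardCharsets.US_ASCII)));

        // IBM037 (an EBCDIC code page) is optional, so check before using it.
        if (Charset.isSupported("IBM037")) {
            System.out.println(Arrays.toString(codes("Help", Charset.forName("IBM037"))));
        } else {
            System.out.println("No EBCDIC charset (IBM037) on this runtime");
        }
    }
}
```

The same four characters produce different numbers under each charset, which is exactly why the bytes alone are meaningless without knowing the mapping.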

Basic knowledge of character sets

Turning to the Java language, three classes in the java.nio.charset package help with this mapping: Charset, CharsetEncoder, and CharsetDecoder. These classes work together so that you can take data in one mapping and convert it to another. To convert from another mapping to the Java mapping (Unicode), you use a decoder. Then, to convert from the Java mapping (Unicode) to another mapping (or back to the original one), you use an encoder. The java.nio.charset package does not convert directly between two non-Unicode formats, but you can convert between them by going through Unicode as an intermediate format.

Before you can get a decoder or an encoder, you need to get the Charset for a particular mapping. For example, US-ASCII is the name of the mapping for the 7-bit ASCII character set. You simply pass the name to the Charset.forName() method, as follows:

Charset charset =
  Charset.forName("US-ASCII");
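Charset.forName() throws an unchecked exception when the name is malformed or the charset is not available on the runtime. As a hedged sketch of one way to handle that (the class CharsetLookup and the method find() are illustrative names, not part of the API), the lookup can be wrapped so the caller gets an Optional instead:

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;
import java.util.Optional;

class CharsetLookup {
    // Hypothetical helper: returns the named charset, or an empty Optional
    // if the name is syntactically illegal or not supported by this runtime.
    static Optional<Charset> find(String name) {
        try {
            return Optional.of(Charset.forName(name));
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Charset names are case-insensitive; name() returns the canonical form.
        System.out.println(find("us-ascii").map(Charset::name).orElse("not available"));
        // A made-up name, assumed not to exist on any runtime.
        System.out.println(find("no-such-charset-xyz").map(Charset::name).orElse("not available"));
    }
}
```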

Once you have the Charset, simply ask it for a CharsetDecoder and a CharsetEncoder, as follows:

CharsetDecoder decoder =
  charset.newDecoder();
CharsetEncoder encoder =
  charset.newEncoder();

With decoders and encoders, you can convert between different character sets, as follows:

ByteBuffer bytes = ...;
CharBuffer chars = decoder.decode(bytes);
bytes = encoder.encode(chars);
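To make the decode-then-encode step concrete, here is a small self-contained sketch that converts bytes from one charset to another by way of Unicode characters. The class name Transcode, the method convert(), and the sample bytes are assumptions chosen for illustration. It relies on the documented default that a newly created decoder or encoder reports malformed or unmappable input by throwing a CharacterCodingException.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Transcode {
    // Hypothetical helper: decodes the input with one charset, then
    // encodes the resulting Unicode characters with another.
    static byte[] convert(byte[] input, Charset from, Charset to)
            throws CharacterCodingException {
        CharBuffer chars = from.newDecoder().decode(ByteBuffer.wrap(input));
        ByteBuffer encoded = to.newEncoder().encode(chars);
        byte[] result = new byte[encoded.remaining()];
        encoded.get(result); // copy only the bytes that were actually written
        return result;
    }

    public static void main(String[] args) throws CharacterCodingException {
        // "café" in ISO-8859-1: the accented e is the single byte 0xE9
        byte[] latin1 = {0x63, 0x61, 0x66, (byte) 0xE9};

        byte[] utf8 = convert(latin1, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(utf8.length); // 5, because the accented e takes two bytes in UTF-8

        try {
            // US-ASCII has no accented e, so the encoder reports an error.
            convert(latin1, StandardCharsets.ISO_8859_1, StandardCharsets.US_ASCII);
        } catch (CharacterCodingException e) {
            System.out.println("Cannot encode as US-ASCII: " + e);
        }
    }
}
```

If silently substituting a replacement character is preferable to an exception, the encoder's onUnmappableCharacter(CodingErrorAction.REPLACE) setting is one option to consider.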

Of course, if you are not sure which character sets are available, you can ask with the following statement:

SortedMap<String, Charset> map =
  Charset.availableCharsets();
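The returned map is keyed by canonical charset name. A brief sketch of walking it follows; the class name ListCharsets and the helper names() are illustrative only, and the output naturally differs from one runtime to another.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;

class ListCharsets {
    // Hypothetical helper: canonical names of every charset this runtime offers.
    static List<String> names() {
        return new ArrayList<>(Charset.availableCharsets().keySet());
    }

    public static void main(String[] args) {
        SortedMap<String, Charset> available = Charset.availableCharsets();
        for (Map.Entry<String, Charset> entry : available.entrySet()) {
            Charset cs = entry.getValue();
            // Some charsets can only decode, so canEncode() is worth checking.
            System.out.println(entry.getKey()
                + " aliases=" + cs.aliases()
                + " canEncode=" + cs.canEncode());
        }
        System.out.println(names().size() + " charsets available");
    }
}
```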

You use a decoder to convert external bytes to internal characters. Then, if you need to send the data back outside the Java code, you use an encoder to convert the internal characters to external bytes. Which character sets are available depends on your runtime. However, every implementation of the Java platform must support the following encodings:

US-ASCII: 7-bit ASCII

ISO-8859-1: ISO Latin alphabet No. 1

UTF-8: 8-bit UCS Transformation Format

UTF-16BE: 16-bit UCS Transformation Format, big-endian byte order

UTF-16LE: 16-bit UCS Transformation Format, little-endian byte order

UTF-16: 16-bit UCS Transformation Format, byte order identified by an optional byte-order mark
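The list above can be verified on any runtime, and the byte-order differences among the three UTF-16 variants are easy to see by encoding a single character. The sketch below is illustrative: the class RequiredCharsets and its helpers allSupported() and hex() are invented names, while StandardCharsets (available since Java 7) provides constants for exactly these six charsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RequiredCharsets {
    static final String[] REQUIRED = {
        "US-ASCII", "ISO-8859-1", "UTF-8", "UTF-16BE", "UTF-16LE", "UTF-16"
    };

    // Hypothetical helper: true if every required charset is present.
    static boolean allSupported() {
        for (String name : REQUIRED) {
            if (!Charset.isSupported(name)) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical helper: the encoded bytes of the text as space-separated hex.
    static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("All required charsets present: " + allSupported());
        System.out.println(hex("A", StandardCharsets.UTF_16BE)); // 00 41
        System.out.println(hex("A", StandardCharsets.UTF_16LE)); // 41 00
        // When encoding, Java's UTF-16 writes a big-endian byte-order mark first.
        System.out.println(hex("A", StandardCharsets.UTF_16));   // FE FF 00 41
    }
}
```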

Beyond these, a platform may support additional character sets of its own (on a Windows platform, for example, you will typically find the windows-1252 character set). If you need a character set that is not supported, you can create your own. See the CharsetProvider API in the java.nio.charset.spi package.
