Java Pitfalls: The Concepts of Characters and Bytes and Their Differences

Source: Internet
Author: User

First, consider this question: "How many bytes of memory does the string "java" occupy in the Java language?" To answer it, we must first understand what a "byte" is and what a "character" is.

Byte: a byte is the unit in which information is transmitted over a network or stored on disk or in memory. It is the unit of measurement that computing uses for storage capacity and transmission volume. One byte equals 8 binary bits, that is, one 8-bit binary number, so it denotes a very specific amount of storage space.

Character: a symbol that people use, a symbol in the abstract sense, for example '1', '中', 'a', '$', '¥', ...
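The distinction between the two definitions can be seen in Java's own primitive types. The sketch below is only an illustration; the class name `ByteAndCharDemo` and its helper methods are hypothetical names introduced here, not part of the original article.

```java
class ByteAndCharDemo {
    // Number of bits in a Java byte (the storage unit).
    static int bitsPerByte() {
        return Byte.SIZE;
    }

    // Number of bits in a Java char (one 16-bit code unit).
    static int bitsPerChar() {
        return Character.SIZE;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xD6;   // a raw 8-bit value; on its own it names no symbol
        char c = '\u4E2D';      // an abstract symbol (a Chinese character), written as an escape
        System.out.println("bits per byte: " + bitsPerByte());
        System.out.println("bits per char: " + bitsPerChar());
        System.out.println("byte value: " + (b & 0xFF));
        System.out.println("char code: U+" + Integer.toHexString(c).toUpperCase());
    }
}
```

A `byte` is just an 8-bit quantity, while a `char` stands for a symbol; which bytes represent that symbol depends on the encoding.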

When it comes to characters, we have to mention two different families of encoding standards, ANSI and Unicode (they are only sketched here; look them up yourself if you are interested). In an ANSI encoding, English characters are stored in a single byte and Chinese characters in two bytes. In the 16-bit Unicode encoding discussed here, English and Chinese characters alike are stored in two bytes. Unicode is also an international standard encoding, and its two-byte form is not compatible with the ANSI codes.

The ANSI rule for Chinese: a byte value of 127 or less keeps its original meaning, but two bytes greater than 127 placed together represent one Chinese character. The first byte (the "high byte") runs from 0xA1 to 0xF7 and the second (the "low byte") from 0xA1 to 0xFE, which gives 87 × 94 = 8,178 two-byte code positions. Even the digits, punctuation, and letters that already existed in ASCII were re-encoded as two-byte codes; these are the so-called "full-width" characters, while the originals below 128 are called "half-width" characters.

In Unicode, whether it is a half-width English letter or a full-width Chinese character, each is uniformly "one character", and each is uniformly two bytes.
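The high-byte and low-byte ranges described above can be turned into a small scanner that counts characters in a multibyte byte sequence. This is a minimal sketch under the stated byte ranges only; the class `MultibyteRuleDemo` and its methods are hypothetical helpers, and a real decoder would handle invalid input more carefully.

```java
class MultibyteRuleDemo {
    // True if (hi, lo) forms a two-byte character under the ranges stated in the text.
    static boolean isDoubleByte(int hi, int lo) {
        return hi >= 0xA1 && hi <= 0xF7 && lo >= 0xA1 && lo <= 0xFE;
    }

    // Counts characters: a valid high/low pair counts as one character;
    // every other byte (including all bytes <= 127) counts as one character.
    static int countCharacters(byte[] data) {
        int count = 0;
        int i = 0;
        while (i < data.length) {
            int hi = data[i] & 0xFF;
            boolean pair = i + 1 < data.length && isDoubleByte(hi, data[i + 1] & 0xFF);
            i += pair ? 2 : 1;
            count++;
        }
        return count;
    }

    // Number of two-byte code positions the two ranges allow.
    static int doubleByteCodePositions() {
        return (0xF7 - 0xA1 + 1) * (0xFE - 0xA1 + 1);
    }

    public static void main(String[] args) {
        // 'j', 'a', then one two-byte character (0xD6 0xD0): 4 bytes, 3 characters.
        byte[] data = {0x6A, 0x61, (byte) 0xD6, (byte) 0xD0};
        System.out.println("bytes: " + data.length);
        System.out.println("characters: " + countCharacters(data));
        System.out.println("two-byte code positions: " + doubleByteCodePositions());
    }
}
```

The point of the sketch is that byte count and character count diverge as soon as a two-byte character appears in the sequence.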

We can draw a simple conclusion: under the ANSI encodings, punctuation, digits, and uppercase and lowercase English letters each take one byte, and a Chinese character takes 2 bytes. Under 16-bit Unicode, every character takes 2 bytes. (Strictly, this holds for characters up to U+FFFF; UTF-16 represents the rarer characters beyond that range with two 16-bit units.)
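This conclusion can be checked by encoding single characters and counting the resulting bytes. The sketch below assumes GBK as a representative Chinese ANSI encoding; GBK is not one of the charsets every Java runtime must provide, so the code checks for it first. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class BytesPerCharacterDemo {
    // Number of bytes produced when s is encoded with the given charset.
    static int encodedLength(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        // "a", "1", "$" and one Chinese character (U+4E2D), written as an escape.
        String[] samples = {"a", "1", "$", "\u4E2D"};
        // Assumption: GBK stands in for "ANSI" here; it may be absent on minimal runtimes.
        boolean hasGbk = Charset.isSupported("GBK");
        Charset gbk = hasGbk ? Charset.forName("GBK") : null;
        for (String s : samples) {
            String ansi = hasGbk ? String.valueOf(encodedLength(s, gbk)) : "n/a";
            int wide = encodedLength(s, StandardCharsets.UTF_16BE);
            System.out.println("U+" + Integer.toHexString(s.charAt(0)).toUpperCase()
                    + "  GBK bytes: " + ansi + "  UTF-16BE bytes: " + wide);
        }
    }
}
```

`UTF_16BE` is used rather than `UTF_16` because the latter prepends a byte-order mark when encoding, which would inflate the count.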

Now let's look at strings. Because there are two kinds of encoding standard for characters, strings are likewise divided into two types.

String (ANSI): if the "characters" in memory are in an ANSI-encoded form, where one character may be represented by one byte or by several, we call the string an ANSI string or a multibyte string.

String (Unicode): if the "characters" in memory are in Unicode form, we call the string a Unicode string or a wide-character string.

Because the different ANSI encodings define different standards, for a given multibyte string we must know which encoding rule it uses before we can know which "characters" it contains. For a Unicode string, by contrast, the "characters" it represents are always the same, regardless of the environment.
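The dependence of a multibyte string on its encoding rule can be demonstrated by decoding the same two bytes under two different single-byte or multibyte charsets. This is a sketch with hypothetical names; ISO-8859-1 is guaranteed on every Java runtime, while GBK is an assumption and is only used if available.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeAmbiguityDemo {
    // Interprets the bytes as text under the given charset.
    static String decode(byte[] data, Charset cs) {
        return new String(data, cs);
    }

    public static void main(String[] args) {
        byte[] multibyte = {(byte) 0xD6, (byte) 0xD0};

        // As ISO-8859-1 the two bytes are two separate Western European letters.
        String latin = decode(multibyte, StandardCharsets.ISO_8859_1);
        System.out.println("ISO-8859-1: " + latin.length() + " character(s)");

        // As GBK (if the runtime provides it) the same bytes are one Chinese character.
        if (Charset.isSupported("GBK")) {
            String chinese = decode(multibyte, Charset.forName("GBK"));
            System.out.println("GBK: " + chinese.length() + " character(s)");
        }

        // 16-bit Unicode bytes for U+4E2D decode to the same character everywhere.
        byte[] wide = {0x4E, 0x2D};
        String unicode = decode(wide, StandardCharsets.UTF_16BE);
        System.out.println("UTF-16BE: U+" + Integer.toHexString(unicode.charAt(0)).toUpperCase());
    }
}
```

The bytes never change; only the rule used to read them does, which is why a multibyte string is meaningless without knowing its encoding.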

As a result, the problem we raised above is solved because the characters in Java are encoded in Unicode, so the "Learn Java" string takes up 10 bytes in the Java language .
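The arithmetic is easy to verify: count the chars in the string and multiply by the size of one char. The sample strings and the names `StringSizeDemo` and `utf16Bytes` below are assumptions chosen for illustration, and the sketch measures character data only.

```java
import java.nio.charset.StandardCharsets;

class StringSizeDemo {
    // Bytes of character data when each char is one 16-bit code unit.
    static int utf16Bytes(String s) {
        return s.length() * Character.BYTES;
    }

    public static void main(String[] args) {
        String four = "java";        // four chars
        String five = "\u5B66java";  // one Chinese character plus "java": five chars

        System.out.println(four.length() + " chars, " + utf16Bytes(four) + " bytes");
        System.out.println(five.length() + " chars, " + utf16Bytes(five) + " bytes");

        // Cross-check against an actual UTF-16 encoding without a byte-order mark.
        System.out.println(five.getBytes(StandardCharsets.UTF_16BE).length + " bytes when encoded");
    }
}
```

`String.length()` returns the number of 16-bit chars, not bytes, so the byte figure always has to be derived from it rather than read off directly.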
