Unicode and .NET

Source: Internet
Author: User

http://csharpindepth.com/Articles/General/Unicode.aspx

Scope of this page

This is a big topic. Don't expect this page to do more than scratch the surface. Indeed, if you believe you're already fairly experienced and knowledgeable about character encodings and the like, this page may well not have anything new or useful for you. However, there are still many people who don't understand the difference between binary and text, or know what a character encoding is, etc. It's for these people that this page has been written. It mentions a few advanced topics, but only to make the reader aware of their existence, rather than to give much guidance on them.

Resources

The resource links are listed in the original article at the URL given above.

Binary and Text - a Big Distinction

Most modern computer languages (and some older ones) make a big distinction between "binary" content and "character" (or "text") content.

The difference is largely the same as the instinctive one, but for the purposes of clarity, I'll define it here as:

    • Binary content is a sequence of octets (bytes in common parlance) with no intrinsic meaning attached. Even though there may be external means of understanding a piece of binary content to be, say, a picture or an executable file, the content itself is just a sequence of bytes. (Note for pedantic readers: from now on, I won't use the word "octet". I'll use "byte" instead, even though strictly speaking a byte needn't be an octet. There have been architectures with 9-bit bytes, for instance. I don't believe that's a particularly relevant or useful distinction to make in this day and age, and readers are likely to be more comfortable with the word "byte".)
    • Character content is a sequence of characters.
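To make the two definitions concrete, here is a minimal sketch in Java (one of the two environments this page names; the .NET equivalents live in System.Text.Encoding). The class name, the method names and the choice of the two bytes 0xC3 0xA9 are illustrative assumptions, not something taken from the original article. It shows that the same bytes carry no meaning until some encoding is chosen to read them as characters.

```java
import java.nio.charset.StandardCharsets;

public class BytesVersusText {
    // Two bytes of binary content; on their own they are just numbers.
    static final byte[] RAW = { (byte) 0xC3, (byte) 0xA9 };

    // Read the bytes as UTF-8: together they form one character (U+00E9).
    static String decodeUtf8() {
        return new String(RAW, StandardCharsets.UTF_8);
    }

    // Read the same bytes as ISO-8859-1: each byte becomes its own character.
    static String decodeLatin1() {
        return new String(RAW, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println("UTF-8 length:      " + decodeUtf8().length());
        System.out.println("ISO-8859-1 length: " + decodeLatin1().length());
    }
}
```

The byte array is identical in both calls; only the external knowledge of which encoding applies turns it into one character or two.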

The Unicode Glossary defines a character as:

    1. The smallest component of written language that has semantic value; refers to the abstract meaning and/or shape, rather than a specific shape (see also glyph), though in code tables some form of visual representation is essential for the reader's understanding.
    2. Synonym for abstract character. (See Definition D3 in Section 3.3, Characters and Coded Representations. http://www.unicode.org/versions/Unicode7.0.0/ch03.pdf#g2212)
    3. The basic unit of encoding for the Unicode character encoding.
    4. The English name for the ideographic written elements of Chinese origin. (See ideograph (2).)

That may or may not be a terribly useful definition to you, but for the most part you can again use your instinctive understanding: a character is something like "the capital letter A", "the digit 1" etc. There are other characters which are less obvious, such as combining characters (for example "an acute accent"), control characters (for example "newline"), and formatting characters (invisible, but affecting surrounding characters). The important thing is that these are fundamentally "text" in some form or other. They have some meaning attached to them.
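The less obvious kinds of character mentioned above can be inspected directly. The following is a small sketch in Java, using the standard java.lang.Character categories; the class name, the method names and the specific sample characters (U+0301 combining acute accent, newline, U+200D zero width joiner) are assumptions chosen for illustration.

```java
public class CharacterKinds {
    // "e" followed by a combining acute accent: two characters, drawn as one shape.
    static final String COMBINING = "e\u0301";
    // The precomposed form of the same letter: a single character.
    static final String PRECOMPOSED = "\u00E9";

    // Number of Unicode code points in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    static boolean isCombiningMark(char c) {
        return Character.getType(c) == Character.NON_SPACING_MARK;
    }

    static boolean isControl(char c) {
        return Character.getType(c) == Character.CONTROL;
    }

    static boolean isFormatting(char c) {
        return Character.getType(c) == Character.FORMAT;
    }

    public static void main(String[] args) {
        System.out.println("combining form code points:   " + codePoints(COMBINING));
        System.out.println("precomposed form code points: " + codePoints(PRECOMPOSED));
        System.out.println("acute accent is a combining mark: " + isCombiningMark('\u0301'));
        System.out.println("newline is a control character:   " + isControl('\n'));
        System.out.println("zero width joiner is formatting:  " + isFormatting('\u200D'));
    }
}
```

None of these characters has a visible shape of its own in the way "A" does, yet each is still text with a meaning attached.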

Now, unfortunately, in the past this distinction has been very blurred. C programmers are often used to thinking of "byte" and "char" as being interchangeable, to the extent that they will talk about reading a certain number of characters even when the content is entirely binary. In modern environments such as .NET and Java, where the distinction is clear and present in the IO libraries, this can lead to people attempting to copy binary files by reading and writing characters, resulting in corrupt output.
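The corruption described here is easy to reproduce. The sketch below, in Java, pushes a few bytes through a character representation and back, which is effectively what a character-based copy does; the class name, the method names, the sample bytes and the choice of UTF-8 as the encoding are all assumptions made for the example. In .NET the same mistake typically looks like copying a file with a TextReader and TextWriter instead of a Stream.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BinaryAsTextCorruption {
    // Wrong: decode the bytes as if they were text, then encode them again.
    // Byte sequences that are not valid UTF-8 are replaced during decoding,
    // so the original values cannot be recovered.
    static byte[] copyViaCharacters(byte[] input) {
        String asText = new String(input, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    // Right: binary content is copied as bytes, with no interpretation.
    static byte[] copyViaBytes(byte[] input) {
        return Arrays.copyOf(input, input.length);
    }

    public static void main(String[] args) {
        // Hypothetical binary content containing bytes that are not valid UTF-8.
        byte[] binary = { (byte) 0x89, 'P', 'N', 'G', (byte) 0xFF, 0x00 };

        System.out.println("character copy intact: "
                + Arrays.equals(binary, copyViaCharacters(binary)));
        System.out.println("byte copy intact:      "
                + Arrays.equals(binary, copyViaBytes(binary)));
    }
}
```

The character-based copy fails silently: no exception is thrown, and the damage only shows up when the copied file is opened.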
