Meaning of <meta http-equiv="Content-Type" content="text/html; charset=*">

Source: Internet
Author: User
Tags: coding standards
The meta tag is one of the important tags in the head section of an HTML page.
The http-equiv attribute works like an HTTP response header: it gives the browser information that helps it display the page content correctly. Common http-equiv types include:

Content-Type and Content-Language (character set and language settings)
Note: These declare the character set and language used by the page. The browser uses the declared character set to decode and display the page content.
<meta http-equiv="Content-Type" content="text/html; charset=gb2312"> This meta tag declares that the HTML page uses the gb2312 character set, the Chinese national standard encoding for Simplified Chinese. If you replace "gb2312" with "big5", the page is declared as using Big5, the encoding for Traditional Chinese.
When you browse some overseas sites, IE may prompt you to download support for a particular language in order to display the page correctly. The browser reads the Content-Type attribute of the meta tag to find out which character set the page needs; if that character set is not installed on the system, IE prompts you to download it. Other languages have their own charsets: for example, the Japanese character set is "iso-2022-jp" and the Korean one is "ks_c_5601".
The Content-Type value can also be text/xml or another document type. Charset options include ISO-8859-1 (English), big5, UTF-8, shift-JIS, EUC, Koi8-2, US-ASCII, x-Mac-Roman, iso-8859-2, X-Mac-ce, iso-2022-jp, X-sjis, X-EUC-JP, EUC-KR, iso-2022-kr, gb2312, gb_2312-80, x-EUC-TW, x-cns11643-1, x-cns11643-2, and other character sets. Content-Language values include en, fr, and other language codes.
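To make the lookup described above concrete, here is a minimal Python sketch of how a program might read the declared charset out of a Content-Type meta tag. It is an illustration only, not how any particular browser does it; the names MetaCharsetParser and declared_charset are hypothetical helpers introduced for this example.

```python
from html.parser import HTMLParser


class MetaCharsetParser(HTMLParser):
    """Hypothetical helper: remembers the charset from the first Content-Type meta tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        # HTMLParser lowercases attribute names; values may be None.
        attributes = {name.lower(): (value or "") for name, value in attrs}
        if attributes.get("http-equiv", "").lower() != "content-type":
            return
        # The content value looks like "text/html; charset=gb2312".
        for part in attributes.get("content", "").split(";"):
            key, _, value = part.partition("=")
            if key.strip().lower() == "charset":
                self.charset = value.strip().lower()


def declared_charset(html_text):
    """Return the declared charset, or None if the page declares none."""
    parser = MetaCharsetParser()
    parser.feed(html_text)
    return parser.charset


page = ('<html><head><meta http-equiv="Content-Type" '
        'content="text/html; charset=gb2312"></head></html>')
print(declared_charset(page))  # gb2312
```

A real browser also considers the HTTP response header and other signals, so this sketch covers only the meta tag case discussed here.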

Character Set and encoding
The "ANSI" encoding standards defined by different countries and regions each specify only the characters needed for their own languages. For example, the Chinese standard (gb2312) does not specify how to store Korean characters. Each of these standards specifies two things:
1. Which characters are used, that is, which Chinese characters, letters, and symbols are included in the standard. This set of characters is called the "character set".
2. Whether each character is stored in one or more bytes, and which bytes are used to store it. This rule is called the "encoding".
When countries and regions define such standards, they generally define the character set and the encoding together. Therefore, what we usually call a "character set", such as gb2312, GBK, or JIS, actually covers both the character set and the encoding.
The Unicode character set contains the characters used in all of these languages. There are several standards for encoding the Unicode character set, such as UTF-8, UTF-7, UTF-16, unicodelittle, unicodebig, and so on.
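The distinction between a character and its encoded bytes can be seen by encoding one character several ways. This is a small Python sketch; it uses Python's codec names, and "utf-16-le" and "utf-16-be" are used here as stand-ins for the little-endian and big-endian variants, which is an assumption about what unicodelittle and unicodebig refer to.

```python
text = "中"  # a single character, Unicode code point U+4E2D

# Same character, different encodings, different bytes.
encoded = {}
for encoding in ("gb2312", "utf-8", "utf-16-le", "utf-16-be"):
    encoded[encoding] = text.encode(encoding).hex()
    print(encoding, encoded[encoding])

# gb2312    d6d0
# utf-8     e4b8ad
# utf-16-le 2d4e
# utf-16-be 4e2d
```

The character stays the same in every case; only the bytes chosen to store it change.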
1. ISO-8859-1
This is the simplest encoding rule: each byte maps directly to the Unicode character with the same value. For example, when the two bytes [0xd6, 0xd0] are converted to a string through iso-8859-1, the result is two Unicode characters, [0x00d6, 0x00d0].
Conversely, when a Unicode string is converted to a byte string through iso-8859-1, only characters in the range 0 to 255 can be converted correctly.
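The ISO-8859-1 behavior can be checked with a short Python sketch. The variable names are illustrative only.

```python
raw = bytes([0xD6, 0xD0])

# Each byte becomes the Unicode character with the same numeric value.
decoded = raw.decode("iso-8859-1")
code_points = [hex(ord(ch)) for ch in decoded]
print(code_points)  # ['0xd6', '0xd0']

# Characters in the range 0 to 255 convert back to the same bytes.
round_trip = decoded.encode("iso-8859-1")

# A character outside that range cannot be converted.
try:
    "中".encode("iso-8859-1")
    out_of_range_ok = True
except UnicodeEncodeError:
    out_of_range_ok = False
print(out_of_range_ok)  # False
```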

2. gb2312, big5, shift_jis, ISO-8859-2
When a Unicode string is converted to a byte string through one of these ANSI encodings, each Unicode character may become one or more bytes, depending on that encoding's rules.
Conversely, when a byte string is converted to a string, several bytes may become a single character. For example, when the two bytes [0xd6, 0xd0] are converted to a string through gb2312, the result is the single character [0x4e2d], which is the Chinese character "中" ("middle").
Features of "ANSI encodings":
1. Each of these ANSI encoding standards can only handle the Unicode characters that fall within its own language's range.
2. The mapping between a Unicode character and its converted bytes is defined by convention, through a table, rather than by calculation.
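Both features can be observed in Python with the built-in gb2312 and big5 codecs. The helper can_encode is a hypothetical name introduced for this sketch.

```python
raw = bytes([0xD6, 0xD0])

# Two bytes become one character under gb2312.
as_gb2312 = raw.decode("gb2312")
print(as_gb2312, hex(ord(as_gb2312)))  # 中 0x4e2d


def can_encode(text, encoding):
    """Hypothetical helper: True if every character of text exists in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# gb2312 covers Chinese characters but not Korean Hangul.
print(can_encode("中", "gb2312"))  # True
print(can_encode("한", "gb2312"))  # False

# The byte values come from each standard's own table, so two
# standards assign different bytes to the same character.
same_bytes = "中".encode("gb2312") == "中".encode("big5")
print(same_bytes)  # False
```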

3. UTF-8, UTF-16, unicodebig
As with ANSI encodings, when a string is converted to a byte string through a Unicode encoding, each Unicode character may become one or more bytes.
What differs from the ANSI encodings is:
1. These Unicode encodings can handle all Unicode characters.
2. The mapping between a Unicode character and its converted bytes can be calculated, with no lookup table needed.
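To illustrate what "can be calculated" means, the following Python sketch derives UTF-8 bytes from a code point using only bit arithmetic and compares the result with Python's built-in encoder. The function name utf8_encode_code_point is hypothetical, and the sketch assumes a valid code point (it does not reject surrogates or values above 0x10FFFF).

```python
def utf8_encode_code_point(cp):
    """Encode one Unicode code point to UTF-8 with bit arithmetic only."""
    if cp < 0x80:         # 1 byte:  0xxxxxxx
        return bytes([cp])
    if cp < 0x800:        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp < 0x10000:      # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


manual = utf8_encode_code_point(0x4E2D)
print(manual.hex())                   # e4b8ad
print(manual == "中".encode("utf-8"))  # True
```

No table of characters is consulted at any point, which is the contrast with the ANSI encodings described earlier.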
