Character sets in HTML: ASCII, ISO-8859-1, symbol entities, and URL encoding

Source: Internet
Author: User

First, HTML entities

1. What is an HTML entity?

In HTML, some characters are reserved. For example, the less-than sign (<) and the greater-than sign (>) cannot be written literally in text, because the browser would mistake them for tags.

To display reserved characters correctly, you must use character entities (HTML entities) in the HTML source code.
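
As a small illustration of replacing reserved characters with entities, here is a sketch using Python's standard html module. The helper name to_html_text and the sample string are made up for this example.

```python
import html


def to_html_text(text: str) -> str:
    """Replace reserved characters (&, <, > and quotes) with character entities."""
    return html.escape(text)


# Hypothetical sample input containing reserved characters.
escaped = to_html_text("if a < b && b > c")
print(escaped)  # if a &lt; b &amp;&amp; b &gt; c
```

Note that the ampersand itself is reserved too, since it starts every entity.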

2. Character entity syntax

&entity_name; or &#entity_number;

Tips:
The advantage of using entity names instead of numbers is that names are easier to remember.
The disadvantage is that browsers may not support every entity name, whereas support for entity numbers is good.
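
The two forms can be compared with a short sketch. It assumes Python's html.unescape as a stand-in for what a browser does when it decodes an entity; the variable names are illustrative only.

```python
import html

# The same character written by name, by decimal number, and by hexadecimal number.
by_name = html.unescape("&lt;")
by_number = html.unescape("&#60;")
by_hex = html.unescape("&#x3C;")

print(by_name, by_number, by_hex)  # < < <
```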


3. Non-breaking space

4. A commonly used character entity in HTML is the non-breaking space, written &nbsp;.
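
A quick check of what the non-breaking space entity decodes to, sketched with Python's html module. The variable name nbsp is illustrative.

```python
import html

nbsp = html.unescape("&nbsp;")

print(ord(nbsp))                        # 160
print(nbsp == " ")                      # False: the ordinary space is code 32
print(html.unescape("&#160;") == nbsp)  # True: name and number give the same character
```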

5. Useful character entities in HTML

For details, see: http://www.w3school.com.cn/html/html_entities.asp

Second, the HTML character set

For an HTML page to display correctly, the browser must know which character set to use.

1. The character set used by the early World Wide Web was ASCII. ASCII supports the digits 0-9, uppercase and lowercase English letters, and some special characters.

Because many international characters are not part of ASCII, browsers adopted ISO-8859-1 as their default character set; current HTML specifications recommend UTF-8 instead.

If the page uses a character set other than ISO-8859-1, it should be specified in the <meta> tag.
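
To see why the declared character set matters, the following sketch decodes the same bytes in two ways. It only imitates what a reader of the page would do; the sample word is arbitrary.

```python
text = "café"
data = text.encode("utf-8")        # bytes as a server might send them

right = data.decode("utf-8")       # reader uses the correct character set
wrong = data.decode("iso-8859-1")  # reader wrongly assumes ISO-8859-1

print(right)  # café
print(wrong)  # cafÃ©
```

The two bytes that UTF-8 uses for "é" are read as two separate ISO-8859-1 characters, which is the typical garbled output of a missing or wrong declaration.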

2. ISO character sets
The ISO character sets are standard character sets defined by the International Organization for Standardization (ISO) for different alphabets and languages.

3. Unicode standard

Because the character sets listed above have limited capacity and do not work well in multilingual environments, the Unicode Consortium developed the Unicode standard.

The Unicode standard aims to cover all characters, punctuation marks, and symbols in the world. Regardless of platform, program, or language, Unicode makes it possible to process, store, and exchange text data.

Unicode can be implemented by different character encodings. The most common are UTF-8 and UTF-16.

A character in UTF-8 is 1 to 4 bytes long. UTF-8 can represent any character in the Unicode standard. UTF-8 is backward compatible with ASCII.
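
The byte lengths and the ASCII compatibility can be checked with a short sketch. The sample characters are chosen only to hit each length once.

```python
# One sample character for each possible UTF-8 length.
samples = ["A", "é", "€", "😀"]
lengths = {ch: len(ch.encode("utf-8")) for ch in samples}
print(lengths)  # {'A': 1, 'é': 2, '€': 3, '😀': 4}

# Pure ASCII text produces identical bytes in both encodings.
ascii_text = "Hello, HTML"
same_bytes = ascii_text.encode("utf-8") == ascii_text.encode("ascii")
print(same_bytes)  # True
```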

UTF-8 is a common encoding for Web pages and e-mail.

Note: All HTML 4 processors have support for UTF-8, and all XHTML and XML processors support UTF-8 and UTF-16


Third, HTML ASCII

HTML and XHTML transmit data over the network with standard 7-bit ASCII code.
A 7-bit ASCII code can provide 128 different character values.
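
The 128-value limit follows from 2 to the power of 7. A minimal sketch; the helper name is_ascii_byte is hypothetical.

```python
# Every value from 0 to 127 decodes as ASCII.
ascii_chars = bytes(range(128)).decode("ascii")
print(len(ascii_chars), 2 ** 7)  # 128 128


def is_ascii_byte(value: int) -> bool:
    """Return True if a single byte value is valid 7-bit ASCII."""
    try:
        bytes([value]).decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


print(is_ascii_byte(127), is_ascii_byte(128))  # True False
```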

Fourth, HTML ISO-8859-1

HTML 4.01 supports the ISO 8859-1 character set.

The lower part of ISO 8859-1 (codes 0 to 127) is the original 7-bit ASCII.

The characters in the higher part of ISO 8859-1 (codes 160 to 255) all have entity names.

Most of these symbols can be used without an entity reference, but an entity name or entity number provides a way to write symbols that are not easy to type on a keyboard.
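
Both statements about ISO 8859-1 can be verified with Python's html.entities table, which lists the HTML 4.01 entity names. This is a sketch; the variable names are illustrative.

```python
from html.entities import codepoint2name

# The lower half of ISO 8859-1 decodes exactly like ASCII.
low = bytes(range(128))
same_low_half = low.decode("iso-8859-1") == low.decode("ascii")

# Every code from 160 to 255 has an HTML 4.01 entity name.
all_named = all(cp in codepoint2name for cp in range(160, 256))

print(same_low_half, all_named)       # True True
print(codepoint2name[169], chr(169))  # copy ©
```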

Fifth, HTML 4.01 symbol entities

These include mathematical symbols, Greek letters, various arrows, technical symbols, and shapes.
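
A few examples from these groups, decoded with Python's html.unescape. The four entities are picked arbitrarily, one per group.

```python
import html

# One Greek letter, one arrow, one mathematical symbol, one shape.
samples = ["&alpha;", "&rarr;", "&sum;", "&spades;"]
decoded = [html.unescape(s) for s in samples]

print(decoded)  # ['α', '→', '∑', '♠']
```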

Sixth, HTML URL encoding

The URL-encoded form represents ASCII characters in hexadecimal format.
Hexadecimal values are used to display non-standard letters and characters in browsers and plug-ins.

URL encoding converts characters into a format that can be transmitted over the Internet.

URL: Uniform Resource Locator
A Web browser requests pages from a Web server by URL.

URL encoding
URLs can only be sent over the Internet using the ASCII character set.

Because URLs often contain characters outside the ASCII set, they must be converted to a valid ASCII format.

URL encoding replaces each non-ASCII character with a "%" followed by two hexadecimal digits per byte.
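
The percent notation can be sketched by hand and compared with Python's urllib.parse.quote. The helper percent_encode_char is hypothetical and assumes the character is first encoded as UTF-8, which is what quote does by default.

```python
from urllib.parse import quote, unquote


def percent_encode_char(ch: str) -> str:
    """Write each UTF-8 byte of ch as '%' plus two hexadecimal digits."""
    return "".join(f"%{b:02X}" for b in ch.encode("utf-8"))


print(percent_encode_char("é"))  # %C3%A9
print(quote("café"))             # caf%C3%A9
print(unquote("caf%C3%A9"))      # café
```

A character that needs two bytes in UTF-8 therefore becomes two percent groups.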

URLs cannot contain spaces. URL encoding normally replaces a space with "+" or with %20.
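
Both ways of handling spaces are available in Python's urllib.parse, as this sketch shows. The query string is an arbitrary example.

```python
from urllib.parse import quote, quote_plus, unquote_plus

query = "html url encoding"

print(quote_plus(query))                  # html+url+encoding
print(quote(query))                       # html%20url%20encoding
print(unquote_plus("html+url+encoding"))  # html url encoding
```

The "+" form is the one used for form data in query strings, while %20 is valid anywhere in a URL.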

Resources:

http://www.oschina.net/translate/what-every-web-developer-must-know-about-url-encoding#Thereservedcharactersarenotwhatyouthinktheyare
http://www.w3schools.com/html/html_entities.asp
http://www.w3school.com.cn/tags/html_ref_language_codes.asp
http://www.w3school.com.cn/html/html_entities.asp
http://en.wikipedia.org/wiki/Percent-encoding
http://blog.csdn.net/wusuopubupt/article/details/8817826
http://blog.163.com/chenzhenhua_007/blog/static/12849264920108119449881/
http://www.qianxingzhem.com/post-1989.html
http://unicode-table.com/en/#cherokee

Summary: I now have a preliminary understanding of the basic background and standards of HTML, but they still need more in-depth study.
