JavaScript character set encoding and decoding

Last Update:2017-02-02 Source: Internet

Author: User

One, Character sets

1) Characters and bytes (Character)

"Character" is the general term for all kinds of letters, numerals, punctuation marks and other symbols. Once encoded, one character occupies 1~n bytes; one byte is 8 bits, and each bit is either 0 or 1.

2) Character set (Character set)

A character set is a collection of characters, and different character sets contain different numbers of characters. Common character sets include the ASCII character set, the GB2312 character set, the Unicode character set, and so on.

3) Character set encoding (Character Encoding)

Character encoding converts the symbols of a character set into binary data that a computer can store and process; decoding converts that binary data back into human-readable symbols.

Most character sets have a single encoding (for example, the GBK character set uses the GBK encoding), but Unicode has several, including UTF-8, UTF-16, UTF-32, and UTF-7.

UTF-8 is currently the most widely used encoding on web pages. It uses one to four bytes per character and is a superset of ASCII, so existing ASCII text does not need to be converted.
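
The one-to-four-byte claim can be checked with a minimal sketch like the one below. It assumes an environment that provides the standard TextEncoder (modern browsers and recent Node.js); the helper name utf8Hex is hypothetical and only used for illustration.

```javascript
// TextEncoder always encodes to UTF-8.
const encoder = new TextEncoder();

// Hypothetical helper: return the UTF-8 bytes of a string as two-digit hex strings.
function utf8Hex(str) {
  return Array.from(encoder.encode(str), (b) => b.toString(16).padStart(2, "0"));
}

console.log(utf8Hex("A"));         // 1 byte, identical to the ASCII code
console.log(utf8Hex("\u00e9"));    // 2 bytes (Latin small letter e with acute)
console.log(utf8Hex("\u4ee3"));    // 3 bytes (a Chinese character)
console.log(utf8Hex("\u{1F600}")); // 4 bytes (an emoji outside the Basic Multilingual Plane)
```

The first call shows why ASCII text needs no conversion: its UTF-8 bytes are the same as its ASCII bytes.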

Two, Number bases in the browser

1) Using decimal and hexadecimal in HTML attributes

In HTML, a character can be written in decimal as "&#56;" or in hexadecimal as "&#x5a;". The hexadecimal form has an extra "x" compared with the decimal form, and its digits also include the six letters a~f, which represent 10~15.
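
As an illustration, the two forms can be generated from a character's code point. This is only a sketch; the helper names toDecimalEntity and toHexEntity are hypothetical and not part of any library.

```javascript
// Hypothetical helper: decimal numeric character reference, e.g. "&#56;".
function toDecimalEntity(ch) {
  return "&#" + ch.codePointAt(0) + ";";
}

// Hypothetical helper: hexadecimal numeric character reference, e.g. "&#x5a;".
function toHexEntity(ch) {
  return "&#x" + ch.codePointAt(0).toString(16) + ";";
}

console.log(toDecimalEntity("8")); // "&#56;"  because "8" has code 56
console.log(toHexEntity("Z"));     // "&#x5a;" because "Z" has code 0x5a (90)
```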

2) Using decimal and hexadecimal in CSS properties

CSS written inside HTML accepts the same HTML forms. In addition, CSS has its own hexadecimal escape: a backslash followed by hexadecimal digits, in the form "\6c".
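
A CSS escape of this kind can be produced with a sketch like the following. The helper name toCssEscape is hypothetical. The trailing space is an assumption of this sketch: CSS allows one whitespace character after the hex digits to mark the end of the escape, which keeps a following hex-like character from being read as part of it.

```javascript
// Hypothetical helper: CSS hexadecimal escape for a single character.
function toCssEscape(ch) {
  // Backslash + hex code point + one terminating space.
  return "\\" + ch.codePointAt(0).toString(16) + " ";
}

console.log(toCssEscape("l")); // "\6c " because "l" has code 0x6c
```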

3) Encoding in JavaScript

Octal and hexadecimal escapes inside a JavaScript string are decoded by the engine, so code written with them can be executed directly through eval. Octal is written as "\56" (the character ".") and hexadecimal as "\x5c" (the backslash character).

If the code contains Chinese characters that must be written as escapes, only the hexadecimal Unicode form can be used, for example "\u4ee3\u7801".
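
The escape forms above can be tried with a short sketch. The octal form is left out of the code on purpose, because legacy octal escapes are a syntax error in strict mode; the helper name toUnicodeEscapes is hypothetical.

```javascript
// "\x5c" is the hexadecimal escape for the backslash character (code 92).
const backslash = "\x5c";

// "\u4ee3\u7801" is the Unicode escape form of two Chinese characters.
const word = "\u4ee3\u7801";

// Hypothetical helper: rewrite every UTF-16 code unit of a string as \uXXXX.
function toUnicodeEscapes(str) {
  let out = "";
  for (let i = 0; i < str.length; i++) {
    out += "\\u" + str.charCodeAt(i).toString(16).padStart(4, "0");
  }
  return out;
}

console.log(backslash.charCodeAt(0)); // 92
console.log(word.length);             // 2
console.log(toUnicodeEscapes(word));  // \u4ee3\u7801
```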

The book "Web Front-end Hacker Technology Disclosure" wraps encoding and decoding in two methods, which rely mainly on the two calls below (the original article links to the specific code).

The core code is "str.charCodeAt(index).toString(radix)" and "String.fromCharCode(parseInt(code, radix))".

The charCodeAt() method returns an integer from 0 to 65535 that represents the UTF-16 code unit at the given index.

The static String.fromCharCode() method returns a string created from the specified sequence of UTF-16 code units.

Encoding and decoding can also be done on an online web page, "Monyerjs".
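
The two core calls can be combined into a pair of functions along these lines. This is a minimal sketch, not the book's actual code; the names encodeToRadix and decodeFromRadix are hypothetical.

```javascript
// Hypothetical helper: encode each UTF-16 code unit as a number string in the given radix.
function encodeToRadix(str, radix) {
  const codes = [];
  for (let i = 0; i < str.length; i++) {
    codes.push(str.charCodeAt(i).toString(radix));
  }
  return codes;
}

// Hypothetical helper: parse each number string in the given radix and rebuild the string.
function decodeFromRadix(codes, radix) {
  return codes.map((code) => String.fromCharCode(parseInt(code, radix))).join("");
}

console.log(encodeToRadix("Hi", 16));          // [ '48', '69' ]
console.log(encodeToRadix("Hi", 8));           // [ '110', '151' ]
console.log(decodeFromRadix(["48", "69"], 16)); // "Hi"
```

Because both functions work on UTF-16 code units, characters outside the Basic Multilingual Plane are encoded as two values (a surrogate pair) and still round-trip correctly.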

4) HTML automatic decoding mechanism

For example, if the hexadecimal form "&#x0048;&#x0065;&#x006c;&#x006c;&#x006f;" is written in a web page, the browser automatically decodes it to "Hello".

The well-known space entity "&nbsp;" is decoded by the same mechanism.
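
The decoding step can be imitated in JavaScript with a regular expression. This is only a rough sketch of the idea, not the browser's HTML parser: it handles numeric references only (not named ones such as "&nbsp;"), and the helper name decodeNumericEntities is hypothetical.

```javascript
// Hypothetical helper: decode decimal (&#72;) and hexadecimal (&#x48;) references.
function decodeNumericEntities(text) {
  return text.replace(/&#(x[0-9a-f]+|[0-9]+);/gi, (match, body) => {
    const isHex = body[0] === "x" || body[0] === "X";
    const codePoint = parseInt(isHex ? body.slice(1) : body, isHex ? 16 : 10);
    // String.fromCodePoint throws a RangeError for values above 0x10FFFF.
    return String.fromCodePoint(codePoint);
  });
}

console.log(decodeNumericEntities("&#x0048;&#x0065;&#x006c;&#x006c;&#x006f;")); // "Hello"
console.log(decodeNumericEntities("&#56;"));                                     // "8"
```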

Three, Encoding in the browser

JavaScript has three pairs of functions that encode and decode strings:

escape/unescape, encodeURI/decodeURI, encodeURIComponent/decodeURIComponent.

The main difference between them is which characters they leave unencoded.

1) escape does not encode 69 characters

*, +, -, ., /, @, _, 0~9, A~Z, a~z. When encoding Unicode values outside 0~255, escape outputs the %u**** format.

2) encodeURI does not encode 82 characters

!, #, $, &, ', (, ), *, +, ,, -, ., /, :, ;, =, ?, @, _, ~, 0~9, A~Z, a~z

3) encodeURIComponent does not encode 71 characters

!, ', (, ), *, -, ., _, ~, 0~9, A~Z, a~z
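
The three character counts can be checked with a short sketch that feeds every ASCII character to each function and counts the ones returned unchanged. It assumes a runtime that still provides the legacy global escape (browsers and Node.js do); the helper name countUnencodedAscii is hypothetical.

```javascript
// Hypothetical helper: count ASCII characters (codes 0~127) that encodeFn leaves as-is.
function countUnencodedAscii(encodeFn) {
  let count = 0;
  for (let code = 0; code < 128; code++) {
    const ch = String.fromCharCode(code);
    if (encodeFn(ch) === ch) count++;
  }
  return count;
}

console.log(countUnencodedAscii(escape));             // 69
console.log(countUnencodedAscii(encodeURI));          // 82
console.log(countUnencodedAscii(encodeURIComponent)); // 71

// The same input through all three functions.
const sample = "a b&c=/?#\u4ee3";
console.log(escape(sample));             // a%20b%26c%3D/%3F%23%u4EE3
console.log(encodeURI(sample));          // a%20b&c=/?#%E4%BB%A3
console.log(encodeURIComponent(sample)); // a%20b%26c%3D%2F%3F%23%E4%BB%A3
```

The sample shows the practical consequence: encodeURI keeps URL delimiters such as & = / ? # intact, so it suits whole URLs, while encodeURIComponent encodes them, so it suits a single query-string value.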
