"Original" Talk about URL encoding issues

Source: Internet
Author: User

Recently I ran into an encoding problem in a project and went around in circles between encodeURI and encodeURIComponent, so I decided to summarize URL encoding in JavaScript. There are many similar articles on the Internet, but what you summarize yourself is what really sticks.

Why do I need to encode URIs?

For URLs, the reason for encoding is that some characters in a URL cause ambiguity.

In general, a URL may contain only English letters, Arabic numerals, and certain punctuation marks; other text and symbols are not allowed. This is because the network standard RFC 1738 lays down a hard rule:

Original: "... only alphanumerics [0-9a-zA-Z], the special characters "$-_.+!*'()," [not including the quotes - ed], and reserved characters used for their reserved purposes may be used unencoded within a URL."
Translation: only letters and digits [0-9a-zA-Z], the special symbols $-_.+!*'(), [without the surrounding quotes], and reserved characters used for their reserved purposes may appear in a URL without being encoded.

However, the network standard does not stipulate exactly how the encoding should be done (in particular, which character set to use for non-ASCII text) and leaves it to the browser. The common practice in browsers today is to replace every character other than a-zA-Z0-9.-_ with a % escape.

Introduction to the three encoding methods

JavaScript provides three pairs of functions for encoding a URL so that it is valid:

escape (encode) ---> unescape (decode)

encodeURI (encode) ---> decodeURI (decode)

encodeURIComponent (encode) ---> decodeURIComponent (decode)

Encoding and decoding are inverse operations, so once you understand how encoding works you also know how decoding works. Only the encoding side is described below.
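The reversibility mentioned above can be checked with a short round trip. This is only a sketch: the sample string is made up for illustration, and the last step shows that the pairs should be kept matched, since decodeURI deliberately leaves some escapes alone.

```javascript
// Hypothetical sample text, chosen only for illustration.
const original = "name=张三&city=São Paulo";

// Encode and decode with the matching pair of functions.
const encoded = encodeURIComponent(original);
const decoded = decodeURIComponent(encoded);

console.log(encoded);
console.log(decoded === original); // true

// Mixing pairs does not fully reverse the encoding: decodeURI leaves escapes
// such as %26 (&) and %3D (=) alone, because encodeURI never produces them.
const partlyDecoded = decodeURI(encoded);
console.log(partlyDecoded);
```

In other words, whichever function did the encoding, its own partner should do the decoding.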

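The safe characters for these three functions (that is, the characters the function does not encode) are, in addition to ASCII letters and digits:

escape: * @ - _ + . /
encodeURI: - _ . ! ~ * ' ( ) ; / ? : @ & = + $ , #
encodeURIComponent: - _ . ! ~ * ' ( )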
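One way to check which characters each function leaves alone is to run it over the printable ASCII characters and keep the ones that come back unchanged. A minimal sketch; safeChars and punctuationOnly are hypothetical helper names introduced here, not part of any library:

```javascript
// Returns the printable ASCII characters (codes 32 to 126) that `encode`
// returns unchanged, in ASCII order.
function safeChars(encode) {
  let safe = "";
  for (let code = 32; code <= 126; code++) {
    const ch = String.fromCharCode(code);
    if (encode(ch) === ch) safe += ch;
  }
  return safe;
}

// Strip letters and digits so that only the punctuation is left.
const punctuationOnly = (s) => s.replace(/[A-Za-z0-9]/g, "");

const uriSafe = punctuationOnly(safeChars(encodeURI));
const componentSafe = punctuationOnly(safeChars(encodeURIComponent));

console.log(uriSafe);       // !#$&'()*+,-./:;=?@_~
console.log(componentSafe); // !'()*-._~
```

The difference between the two results is exactly the set of characters that have a special meaning in a URL.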

What are ASCII characters?

Before introducing the three methods, let's look at the ASCII characters.

Wikipedia:

ASCII (American Standard Code for Information Interchange) is a computer character encoding system based on the Latin alphabet. It is mainly used to display modern English and other Western European languages. It is the most widely used single-byte encoding system today and is equivalent to the international standard ISO/IEC 646.

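The following is a partial ASCII code table:

Character   Decimal   Hex
(space)     32        20
#           35        23
$           36        24
%           37        25
&           38        26
+           43        2B
/           47        2F
0-9         48-57     30-39
:           58        3A
;           59        3B
=           61        3D
?           63        3F
@           64        40
A-Z         65-90     41-5A
a-z         97-122    61-7A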

escape ---> unescape

escape does not encode ASCII letters and digits, nor the following ASCII punctuation marks: * @ - _ + . / All other characters are replaced by escape sequences.

This method is obsolete. ECMAScript v3 advises against using it; applications should use encodeURI() and encodeURIComponent() instead. For that reason escape is not covered in detail here.

In short, it is best not to use escape for encoding, because it will cause problems. I fell into this pit in an earlier project.
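The kind of problem escape causes can be seen by comparing its output with encodeURIComponent on a few characters. This is a small sketch with sample characters picked for illustration; escape is called only to show the difference, not as a recommendation:

```javascript
// Sample characters chosen for illustration; "\u00E9" is é.
const samples = ["+", "\u00E9", "中"];

const rows = samples.map((ch) => ({
  char: ch,
  escaped: escape(ch),                 // deprecated, shown for comparison only
  component: encodeURIComponent(ch),   // UTF-8 based percent-encoding
}));

for (const row of rows) {
  console.log(row.char, row.escaped, row.component);
}
// +  +       %2B
// é  %E9     %C3%A9
// 中 %u4E2D  %E4%B8%AD
```

escape leaves + untouched, writes é as a single Latin-1 style byte, and uses the non-standard %uXXXX form for 中, so a server that expects UTF-8 percent-encoding is likely to misread all three.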

encodeURI ---> decodeURI

encodeURI is intended for encoding an entire URL. Besides the usual safe symbols, it also leaves unencoded the symbols that have a special meaning in a URL: ; / ? : @ & = + $ , # Every other character is converted to its UTF-8 bytes, and each byte is written as % followed by two hexadecimal digits.
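The "UTF-8 bytes, each prefixed with %" step can be reproduced by hand to see what the built-in function does for a non-ASCII character. A minimal sketch, assuming an environment where TextEncoder is available (modern browsers and Node.js); percentEncodeBytes is a hypothetical helper name, and unlike encodeURI it has no safe characters at all:

```javascript
// Percent-encode every UTF-8 byte of a string by hand.
function percentEncodeBytes(text) {
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes of the string
  return Array.from(
    bytes,
    (b) => "%" + b.toString(16).toUpperCase().padStart(2, "0")
  ).join("");
}

const manual = percentEncodeBytes("中");
const builtin = encodeURI("中");

console.log(manual);             // %E4%B8%AD
console.log(manual === builtin); // true
```

The character 中 takes three bytes in UTF-8, which is why it turns into three % groups.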

Because encodeURI does not encode ; / ? : @ & = + $ , # it is well suited to encoding a complete URL: these are the characters that separate the host, path, query, and other parts. The corresponding decoding function is decodeURI.
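As a rough illustration, here is encodeURI applied to a complete URL. The address is hypothetical and exists only for this example:

```javascript
// Hypothetical URL containing a space and non-ASCII text.
const url = "http://example.com/my path/index.html?q=中文&page=1#top";

const encodedUrl = encodeURI(url);
console.log(encodedUrl);
// http://example.com/my%20path/index.html?q=%E4%B8%AD%E6%96%87&page=1#top

// decodeURI reverses it.
console.log(decodeURI(encodedUrl) === url); // true
```

The structure of the URL (: / ? & = #) is left intact; only the space and the Chinese characters are escaped.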

encodeURIComponent ---> decodeURIComponent

Judging from the safe character ranges mentioned above, encodeURIComponent encodes a larger range of characters than encodeURI.

The difference is that encodeURI encodes an entire URL, while encodeURIComponent encodes an individual part of a URL. That is why characters such as ; / ? : @ & = + $ , # are encoded as well.

So if you want to encode one part of a URL rather than the whole URL, encodeURIComponent is the right choice. In day-to-day work, encodeURIComponent is the one used more often.
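A typical case is building a query string from separate values. The sketch below uses a hypothetical base URL and made-up parameter values; it also shows what goes wrong if encodeURI is used for a single value:

```javascript
// Hypothetical base URL and parameter values, used only for illustration.
const base = "http://example.com/search";
const params = { q: "tom & jerry", redirect: "http://example.com/a?b=1#c" };

// Encode each key and value separately, then join them with & and =.
const query = Object.entries(params)
  .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
  .join("&");

const fullUrl = `${base}?${query}`;
console.log(fullUrl);
// http://example.com/search?q=tom%20%26%20jerry&redirect=http%3A%2F%2Fexample.com%2Fa%3Fb%3D1%23c

// For comparison: encodeURI leaves & untouched, so the value is split in two.
const broken = `${base}?q=${encodeURI(params.q)}`;
console.log(broken);
// http://example.com/search?q=tom%20&%20jerry
```

With encodeURIComponent the & inside the value becomes %26, so the receiving side sees one parameter q instead of two.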

Summary

It turns out there is not that much to say about URL encoding. To finish, here is a short summary.

1. escape() is deprecated. Enough said: do not use it.

2. encodeURI(): encodes an entire URL. It does not encode special symbols such as ; / ? : @ & = + $ , # The corresponding decoding function is decodeURI().

3. encodeURIComponent(): also encodes the special characters ; / ? : @ & = + $ , # Use it to encode individual components of a URL; it is the more commonly used of the two. The corresponding decoding function is decodeURIComponent().

