I recently ran into an encoding problem in a project and went around in circles between encodeURI and encodeURIComponent, so I decided to summarize how URL encoding works in JavaScript. There are many similar articles on the Internet, but what you summarize yourself is what actually sticks.
Why do I need to encode URIs?
URLs need to be encoded because some characters would otherwise be ambiguous inside a URL.
In general, a URL may contain only English letters, Arabic numerals, and certain punctuation marks; other text and symbols cannot be used directly. This is a hard rule set by the network standard RFC 1738:
Original: "... Only alphanumerics [0-9a-zA-Z], the special characters "$-_.+!*'()," [not including the quotes - ed], and reserved characters used for their reserved purposes may be used unencoded within a URL."
In other words: only letters and digits [0-9a-zA-Z], the special symbols $-_.+!*'(), (without the surrounding double quotes), and reserved characters used for their reserved purposes can appear in a URL without being encoded.
However, the standard does not stipulate exactly how the encoding should be done and leaves it to the browser. The common practice today is to percent-encode (replace with % followed by hexadecimal digits) everything except a-zA-Z0-9.-_.
Introduction to three coding methods
JavaScript provides three pairs of functions for encoding a string into a valid URL and decoding it again:
escape (encode) ---> unescape (decode)
encodeURI (encode) ---> decodeURI (decode)
encodeURIComponent (encode) ---> decodeURIComponent (decode)
Encoding and decoding are inverse operations, so once the encoding process is understood the decoding process follows. For that reason only the encoding side is described here.
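The round trip can be illustrated with a small sketch. The sample string and variable names here are made up for illustration; the second half shows that decoding is only the inverse for well-formed input, since a stray % makes decodeURIComponent throw a URIError.

```javascript
// Hypothetical sample value containing reserved and non-ASCII characters.
const roundTripInput = "name=张三&tag=a/b";

// Encode, then decode with the matching function of the same pair.
const roundTripEncoded = encodeURIComponent(roundTripInput);
const roundTripDecoded = decodeURIComponent(roundTripEncoded);

console.log(roundTripEncoded);
console.log(roundTripDecoded === roundTripInput);

// A lone "%" is not a valid escape sequence, so decoding fails.
let malformedError = null;
try {
  decodeURIComponent("%");
} catch (err) {
  malformedError = err;
}
console.log(malformedError instanceof URIError);
```

In practice it is worth wrapping decode calls on untrusted input in try/catch for this reason.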
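The safe characters for these three functions (that is, the characters each function does not encode) are:
escape: A-Z a-z 0-9 * @ - _ + . /
encodeURI: A-Z a-z 0-9 - _ . ! ~ * ' ( ) ; / ? : @ & = + $ , #
encodeURIComponent: A-Z a-z 0-9 - _ . ! ~ * ' ( )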
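The safe sets can also be checked empirically rather than memorized. The sketch below is one way to do it, assuming it is enough to look at printable ASCII (codes 32 to 126); the helper name safeChars is made up for this example.

```javascript
// Return the printable ASCII characters that the given encoder leaves unchanged.
function safeChars(encode) {
  let result = "";
  for (let code = 32; code <= 126; code++) {
    const ch = String.fromCharCode(code);
    if (encode(ch) === ch) {
      result += ch;
    }
  }
  return result;
}

const safeForEncodeURI = safeChars(encodeURI);
const safeForEncodeURIComponent = safeChars(encodeURIComponent);

console.log("encodeURI:          " + safeForEncodeURI);
console.log("encodeURIComponent: " + safeForEncodeURIComponent);
```

Every character printed for encodeURIComponent also appears in the encodeURI line; the extra ones in the encodeURI line are the URL delimiters.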
What are ASCII characters
Before introducing the three methods, let's look at what ASCII characters are.
Wikipedia:
ASCII (American Standard Code for Information Interchange) is a computer character encoding based on the Latin alphabet. It is mainly used to represent modern English and other Western European languages. It is the most widely used single-byte encoding system and is equivalent to ISO/IEC 646.
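The following is a partial ASCII code table:
Character   Decimal   Hex
(space)     32        20
#           35        23
%           37        25
&           38        26
+           43        2B
/           47        2F
0           48        30
=           61        3D
?           63        3F
A           65        41
a           97        61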
escape ---> unescape
escape does not encode ASCII letters and digits, nor the following ASCII punctuation marks: * @ - _ + . / All other characters are replaced by escape sequences.
This method is obsolete. ECMAScript v3 advises against it, and applications should use encodeURI() and encodeURIComponent() instead, so escape is not covered in much detail here.
In other words, it is best not to use escape for encoding at all; it will cause problems. I fell into exactly this pit on an earlier project.
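The text does not say which problem the author hit, so the following is only one well-known difference, shown as a sketch: escape uses a non-standard %uXXXX form for characters above U+00FF and leaves + untouched, while encodeURIComponent emits UTF-8 bytes and encodes +. The sample string is made up.

```javascript
// Hypothetical sample: one Chinese character followed by a plus sign.
const escapeSample = "中+";

const viaEscape = escape(escapeSample);                // "%u4E2D+"
const viaComponent = encodeURIComponent(escapeSample); // "%E4%B8%AD%2B"

console.log(viaEscape);
console.log(viaComponent);
```

The %uXXXX form is not part of standard percent-encoding, so a receiver that expects UTF-8 percent-encoding may fail to decode it, and an unencoded + is commonly read as a space in form-encoded query strings.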
encodeURI---> decodeURI
encodeURI is meant for encoding an entire URL. Besides letters, digits, and the usual unreserved marks, it also leaves unencoded the symbols that have a special meaning in a URL: ; / ? : @ & = + $ , # Every other character is converted to its UTF-8 bytes, and each byte is written as % followed by two hexadecimal digits.
Because encodeURI does not encode ; / ? : @ & = + $ , # it works well on a complete URL: these characters are the ones that separate the scheme, host, path, query, and fragment, so the structure of the URL is preserved. The corresponding decoding function is decodeURI.
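As a minimal sketch of this behavior, the example below runs encodeURI over a made-up URL on example.com that contains spaces and Chinese characters. The delimiters survive, while the space and the non-ASCII characters become percent-encoded UTF-8 bytes.

```javascript
// Hypothetical full URL with a space and non-ASCII text in the path and query.
const fullUrl = "http://example.com/my docs/中文?q=a b&lang=en#top";

const encodedFullUrl = encodeURI(fullUrl);
// "http://example.com/my%20docs/%E4%B8%AD%E6%96%87?q=a%20b&lang=en#top"

console.log(encodedFullUrl);

// decodeURI reverses the transformation.
console.log(decodeURI(encodedFullUrl) === fullUrl);
```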
encodeURIComponent ---> decodeURIComponent
Looking at the list of safe characters above, encodeURIComponent encodes a larger range of characters than encodeURI.
The difference is that encodeURI encodes an entire URL, while encodeURIComponent encodes an individual part of a URL. For that reason it also encodes ; / ? : @ & = + $ , #
So if you want to encode one part of a URL, such as a query parameter value, rather than the whole URL, encodeURIComponent is the right choice. In day-to-day work, encodeURIComponent is the one used more often.
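A typical case is passing one URL as a parameter value inside another. The sketch below uses made-up example.com addresses and a hypothetical parameter named next; it shows that encodeURI would leave the inner URL untouched, so its & and = would be misread as part of the outer query string, whereas encodeURIComponent encodes them.

```javascript
// Hypothetical inner URL that should travel as a single parameter value.
const redirectTarget = "http://example.com/a?x=1&y=2";

// encodeURI changes nothing here: every character is in its safe set.
const withEncodeURI = encodeURI(redirectTarget);

// encodeURIComponent encodes the delimiters, so the value stays in one piece.
const loginLink = "http://example.com/login?next=" + encodeURIComponent(redirectTarget);
// "http://example.com/login?next=http%3A%2F%2Fexample.com%2Fa%3Fx%3D1%26y%3D2"

console.log(withEncodeURI === redirectTarget);
console.log(loginLink);
```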
Summary
It turns out there is not that much to say about URL encoding. To finish, here is a short summary.
1, escape() is obsolete; nothing more to say, just don't use it.
2, encodeURI(): encodes an entire URL and leaves special symbols such as ; / ? : @ & = + $ , # unencoded. The corresponding decoding function is decodeURI().
3, encodeURIComponent(): also encodes the special characters ; / ? : @ & = + $ , # and is meant for encoding individual components of a URL. It is the one used most often. The corresponding decoding function is decodeURIComponent().