Differences between urlencode and rawurlencode in PHP


Some time ago I ran into a bug caused by a plus sign in a URL. I had encoded part of the URL with the urlencode function, which converts spaces to plus signs (+), and the resulting URL was parsed incorrectly; the space was only handled correctly when it was encoded as %20. In that situation, use the rawurlencode function instead.

The following describes the differences between urlencode and rawurlencode:

urlencode function:

Returns a string in which all non-alphanumeric characters except -_. are replaced with a percent sign (%) followed by two hexadecimal digits, and spaces are encoded as plus signs (+). This is the same encoding used for data posted from a WWW form, that is, the application/x-www-form-urlencoded media type. For historical reasons, it differs from the RFC 1738 encoding (see rawurlencode()) in that spaces are encoded as plus signs (+).
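As a small illustration of the form-style encoding described above, the following sketch encodes a made-up value and decodes it again. The sample value is an assumption chosen only to contain both a space and a literal plus sign, and parse_str is used here as a stand-in for how PHP fills $_GET from a query string.

```php
<?php
// Hypothetical sample value containing a space and a literal plus sign.
$value = 'a b+c';

// urlencode() writes the space as "+" and the literal "+" as "%2B".
$encoded = urlencode($value);
echo $encoded, "\n"; // a+b%2Bc

// parse_str() decodes a query string the same way PHP does for $_GET,
// so the "+" becomes a space again and "%2B" becomes "+".
parse_str('q=' . $encoded, $params);
echo $params['q'], "\n"; // a b+c
```

The round trip only works because the decoder also follows the form convention; a consumer that treats + as a literal plus sign would see a+b+c instead.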

rawurlencode function:

Returns a string in which all non-alphanumeric characters except -_. are replaced with a percent sign (%) followed by two hexadecimal digits (since PHP 5.3.0 the tilde ~ is also left unencoded). This is the encoding described in RFC 3986 for protecting literal characters from being interpreted as special URL delimiters, and for protecting URLs from being mangled by transmission media that perform character conversions (such as some mail systems). The following is an example:

<?php
$string = "hello world";
echo urlencode($string) . '<br/>';    // output: hello+world
echo rawurlencode($string) . '<br/>'; // output: hello%20world
?>
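One way to apply this when building a URL is sketched below: rawurlencode for the path segment, and http_build_query for the query string. The file name, the /files/ path, and the q parameter are hypothetical and exist only for illustration.

```php
<?php
// Hypothetical file name and search term, chosen only for illustration.
$file = 'my report.pdf';
$query = ['q' => 'hello world'];

// Path segment: rawurlencode() writes the space as %20.
$path = '/files/' . rawurlencode($file);

// Query string: http_build_query() defaults to the urlencode() style (+);
// PHP_QUERY_RFC3986 switches it to the rawurlencode() style (%20).
$formStyle = http_build_query($query);
$rfc3986Style = http_build_query($query, '', '&', PHP_QUERY_RFC3986);

echo $path . '?' . $formStyle, "\n";    // /files/my%20report.pdf?q=hello+world
echo $path . '?' . $rfc3986Style, "\n"; // /files/my%20report.pdf?q=hello%20world
```

The fourth argument of http_build_query is available from PHP 5.4 onward. Whether a plus sign in the query string is acceptable depends on what decodes it, so the %20 form is the safer choice when the consumer is unknown.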

Comparison of specific examples:

<?php
$str = '';
for ($i = 0x20; $i < 0x7f; $i++) {
    $str .= dechex($i); // append each character code as two hex digits
}
$ascii = pack("H*", $str); // turn the hex string into the actual characters
echo "All printable ASCII characters (from space to ~):\n" . $ascii . "\n";
echo "urlencode result:\n" . urlencode($ascii);
echo "\n";
echo "Characters not encoded by urlencode:\n" . preg_replace("/%.{2}/", "", urlencode($ascii));
echo "\n";
echo "rawurlencode result:\n" . rawurlencode($ascii);
echo "\n";
echo "Characters not encoded by rawurlencode:\n" . preg_replace("/%.{2}/", "", rawurlencode($ascii));
echo "\n";
exit;
?>

Output result:
---------------------------
All printable ASCII characters (from space to ~):
 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
urlencode result:
+%21%22%23%24%25%26%27%28%29%2A%2B%2C-.%2F0123456789%3A%3B%3C%3D%3E%3F%40ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5C%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D%7E
Characters not encoded by urlencode:
+-.0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz
rawurlencode result:
%20%21%22%23%24%25%26%27%28%29%2A%2B%2C-.%2F0123456789%3A%3B%3C%3D%3E%3F%40ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5C%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D%7E
Characters not encoded by rawurlencode:
-.0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz

Note: this output shows rawurlencode encoding ~ as %7E, which matches PHP versions before 5.3.0. From PHP 5.3.0 onward, rawurlencode leaves ~ as it is, so the rawurlencode result ends in ~ and ~ also appears in its list of unencoded characters.

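Compare the two results:

1. Digits and uppercase/lowercase letters are not encoded by either function.
2. The minus sign, dot, and underscore are not encoded by either function.
3. In the output above, the only difference is the space: urlencode turns it into a plus sign (+), while rawurlencode encodes it as %20. That is why a plus sign shows up in urlencode's list of unencoded characters; a literal plus sign is encoded as %2B by both functions.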
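The relationship between the two functions can be checked mechanically. The sketch below assumes PHP 7 or later (so rawurlencode leaves ~ unencoded while urlencode still writes %7E); the helper name urlencodeToRaw is hypothetical and not part of PHP.

```php
<?php
// Hypothetical helper: rewrites urlencode() output into rawurlencode() output.
// "+" (an encoded space) becomes "%20", and "%7E" becomes "~" because
// rawurlencode() has left the tilde unencoded since PHP 5.3.0.
function urlencodeToRaw(string $encoded): string
{
    return str_replace(['+', '%7E'], ['%20', '~'], $encoded);
}

// Build the same test string as the comparison above: space through ~.
$ascii = '';
for ($i = 0x20; $i < 0x7f; $i++) {
    $ascii .= chr($i);
}

// True when the two encodings differ only in the space and the tilde.
var_dump(urlencodeToRaw(urlencode($ascii)) === rawurlencode($ascii));
```

The replacement is safe because a literal plus sign never survives urlencode unescaped (it becomes %2B), so every + in the encoded string stands for a space.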

Differences between escape and encodeURIComponent in JavaScript:

>>> console.log(encodeURIComponent("统一注册1"));
%E7%BB%9F%E4%B8%80%E6%B3%A8%E5%86%8C1
>>> console.log(escape("统一注册1"));
%u7EDF%u4E00%u6CE8%u518C1

<?php
echo iconv("UTF-8", "gbk", urldecode("%E7%BB%9F%E4%B8%80%E6%B3%A8%E5%86%8C1"));
echo "\n";
echo urldecode("%u7EDF%u4E00%u6CE8%u518C1"); // urldecode leaves the %uXXXX form untouched
// The %uXXXX form can be decoded with the unescape() function given later in this article:
// echo iconv("UTF-8", "gbk", unescape("%u7EDF%u4E00%u6CE8%u518C1"));
exit;
?>

Output result:
==========================
统一注册1
%u7EDF%u4E00%u6CE8%u518C1
==========================

Result description:

1. encodeURIComponent always converts the input to UTF-8 and percent-encodes it byte by byte.

2. escape encodes by Unicode code point, producing %uXXXX sequences. Because it also encodes characters that are unsafe in a URL, it can be used for URL encoding too, but the server will not decode it automatically. The following PHP decoding function, found in the manual, handles it:

<?php
function unescape($str) {
    $str = rawurldecode($str); // decode ordinary %XX sequences first
    // Split into %uXXXX sequences, &#x....; and &#...; entities, and plain text
    preg_match_all("/(?:%u.{4})|&#x.{4};|&#\d+;|.+/U", $str, $r);
    $ar = $r[0];
    foreach ($ar as $k => $v) {
        // pack() produces big-endian bytes, so the source encoding is named as UCS-2BE
        if (substr($v, 0, 2) == "%u")
            $ar[$k] = iconv("UCS-2BE", "UTF-8", pack("H4", substr($v, -4)));
        elseif (substr($v, 0, 3) == "&#x")
            $ar[$k] = iconv("UCS-2BE", "UTF-8", pack("H4", substr($v, 3, -1)));
        elseif (substr($v, 0, 2) == "&#") {
            $ar[$k] = iconv("UCS-2BE", "UTF-8", pack("n", substr($v, 2, -1)));
        }
    }
    return join("", $ar);
}
?>

>>> console.log(escape(" !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~"));
%20%21%22%23%24%25%26%27%28%29*+%2C-./0123456789%3A%3B%3C%3D%3E%3F@ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D%7E
>>> console.log(encodeURIComponent(" !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~"));
%20!%22%23%24%25%26'()*%2B%2C-.%2F0123456789%3A%3B%3C%3D%3E%3F%40ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D~
>>> console.log(escape("!\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~").replace(/%.{2}/g,""));
*+-./0123456789@ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz
>>> console.log(encodeURIComponent("!\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~").replace(/%.{2}/g,""));
!'()*-.0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz~

Result comparison (letters and digits aside):

Characters not encoded by escape: *+-./@_ (7 in total)

Characters not encoded by encodeURIComponent: !'()*-._~ (9 in total)
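Given these lists, output matching encodeURIComponent can be approximated on the PHP side by running rawurlencode and then restoring the five characters that JavaScript leaves alone. This is only a sketch: the function name encodeUriComponent is hypothetical, the input is assumed to be a UTF-8 string, and PHP 5.3.0 or later is assumed so that ~ is already left unencoded (the type declarations need PHP 7).

```php
<?php
// Hypothetical helper: approximates JavaScript's encodeURIComponent()
// for a UTF-8 input string.
function encodeUriComponent(string $str): string
{
    // rawurlencode() encodes ! ' ( ) * as well; encodeURIComponent() does not,
    // so those five escapes are turned back into literal characters.
    $restore = [
        '%21' => '!',
        '%27' => "'",
        '%28' => '(',
        '%29' => ')',
        '%2A' => '*',
    ];
    return strtr(rawurlencode($str), $restore);
}

echo encodeUriComponent(" !'()*~+"), "\n"; // %20!'()*~%2B
```

strtr is used rather than a chain of str_replace calls because it scans the string once and never re-examines text it has already replaced.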

That covers the differences between urlencode and rawurlencode in PHP. I hope it serves as a useful reference.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
