Differences between urlencode and rawurlencode in PHP

Source: Internet
Author: User
Tags form post urlencode alphanumeric characters

Some time ago, I ran into a bug caused by a plus sign in a URL. I had used the urlencode function on part of the URL, and urlencode converts spaces to plus signs (+). The URL was then parsed incorrectly, because in that position a space is only recognized when it is encoded as %20. In a case like this, rawurlencode is the function to use.
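The kind of mismatch described above can be reproduced in a few lines. This is only a sketch: the file name and the example.com host are made up for illustration and are not taken from the original bug.

```php
<?php
// Hypothetical file name containing a space (an assumption for this example).
$fileName = "annual report.pdf";

// urlencode() targets form data, so the space becomes "+".
$withPlus = "https://example.com/files/" . urlencode($fileName);

// rawurlencode() percent-encodes the space as "%20".
$withPercent = "https://example.com/files/" . rawurlencode($fileName);

echo $withPlus, "\n";    // https://example.com/files/annual+report.pdf
echo $withPercent, "\n"; // https://example.com/files/annual%20report.pdf

// A decoder that follows the "raw" rules does not turn "+" back into a space,
// so only the rawurlencode() form survives the round trip.
var_dump(rawurldecode(urlencode($fileName)) === $fileName);    // bool(false)
var_dump(rawurldecode(rawurlencode($fileName)) === $fileName); // bool(true)
```

A reasonable rule of thumb that follows from this: use rawurlencode for path segments, and keep urlencode for form-style query data.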

The following describes the differences between urlencode and rawurlencode:

urlencode function:

Returns a string in which all non-alphanumeric characters except -_. are replaced with a percent sign (%) followed by two hexadecimal digits, and spaces are encoded as plus signs (+). This is the same encoding used for data posted from a WWW form, that is, the application/x-www-form-urlencoded media type. For historical reasons, it differs from the RFC 1738 encoding (see rawurlencode()) in that spaces are encoded as plus signs (+).

rawurlencode function:

Returns a string in which all non-alphanumeric characters except -_. are replaced with a percent sign (%) followed by two hexadecimal digits. (In PHP 5.3.0 and later, the tilde ~ is also left unencoded.) This is the encoding described in RFC 3986. It protects literal characters from being interpreted as special URL delimiters, and it protects the URL from being mangled by transmission media that perform character conversions (such as some mail systems). The following is an example:

<?php
$string = "hello world";
echo urlencode($string) . '<br/>';    // output: hello+world
echo rawurlencode($string) . '<br/>'; // output: hello%20world
?>
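The same two conventions show up in PHP's built-in http_build_query, which accepts an encoding type as its fourth argument. The sketch below uses hypothetical parameter names and values; it is meant only to show how the two constants map onto the two functions.

```php
<?php
// Hypothetical query parameters, chosen only for illustration.
$params = ['q' => 'hello world', 'lang' => 'en'];

// PHP_QUERY_RFC1738 (the default) encodes like urlencode(): space becomes "+".
$formStyle = http_build_query($params, '', '&', PHP_QUERY_RFC1738);

// PHP_QUERY_RFC3986 encodes like rawurlencode(): space becomes "%20".
$rfc3986Style = http_build_query($params, '', '&', PHP_QUERY_RFC3986);

echo $formStyle, "\n";    // q=hello+world&lang=en
echo $rfc3986Style, "\n"; // q=hello%20world&lang=en

// parse_str() decodes both "+" and "%20" to a space, so a PHP receiver
// sees identical values for either form.
parse_str($formStyle, $fromForm);
parse_str($rfc3986Style, $fromRfc3986);
var_dump($fromForm === $fromRfc3986); // bool(true)
```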

Comparison of specific examples:

<?php
// Build a string containing every printable ASCII character (0x20 to 0x7e).
$str = '';
for ($i = 0x20; $i < 0x7f; $i++) {
    $str .= dechex($i);
}
$ascii = pack("H*", $str);

echo "All printable ASCII characters (from space to ~):\n" . $ascii . "\n";
echo "urlencode result:\n" . urlencode($ascii);
echo "\n";
// Strip every %XX sequence to leave only the characters that were not encoded.
echo "Characters urlencode does not encode:\n" . preg_replace("/%.{2}/", "", urlencode($ascii));
echo "\n";
echo "rawurlencode result:\n" . rawurlencode($ascii);
echo "\n";
echo "Characters rawurlencode does not encode:\n" . preg_replace("/%.{2}/", "", rawurlencode($ascii));
echo "\n";
exit;
?>

Output result:
---------------------------
All printable ASCII characters (from space to ~):
 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
urlencode result:
+%21%22%23%24%25%26%27%28%29%2A%2B%2C-.%2F0123456789%3A%3B%3C%3D%3E%3F%40ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5C%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D%7E
Characters urlencode does not encode:
+-.0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz
rawurlencode result:
%20%21%22%23%24%25%26%27%28%29%2A%2B%2C-.%2F0123456789%3A%3B%3C%3D%3E%3F%40ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5C%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D%7E
Characters rawurlencode does not encode:
-.0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz

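Comparing the two results:

1. Digits and uppercase/lowercase letters are not encoded by either function.
2. The hyphen (-), period (.), and underscore (_) are not encoded by either function.
3. urlencode turns a space into a plus sign (+), while rawurlencode encodes it as %20. That is why the plus sign shows up in urlencode's "not encoded" list but not in rawurlencode's. A literal plus sign is encoded as %2B by both.

The output above comes from an older PHP version. In PHP 5.3.0 and later, rawurlencode also leaves the tilde (~) unencoded, so current versions print ~ instead of %7E in the rawurlencode result.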
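The comparison can also be checked mechanically instead of by eye. The following sketch assumes PHP 5.4 or later (short array syntax, and the RFC 3986 behaviour of rawurlencode); it simply lists every printable ASCII character on which the two functions disagree.

```php
<?php
// Collect each printable ASCII character whose two encodings differ.
$differences = [];
for ($code = 0x20; $code < 0x7f; $code++) {
    $char = chr($code);
    if (urlencode($char) !== rawurlencode($char)) {
        $differences[$char] = [urlencode($char), rawurlencode($char)];
    }
}

foreach ($differences as $char => $pair) {
    printf("'%s': urlencode=%s rawurlencode=%s\n", $char, $pair[0], $pair[1]);
}
// On PHP 5.3+ this prints:
// ' ': urlencode=+ rawurlencode=%20
// '~': urlencode=%7E rawurlencode=~
```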

Differences between escape and encodeURIComponent in JavaScript:

// "统一注册1" means "Unified Registration 1"
>>> console.log(encodeURIComponent("统一注册1"));
%E7%BB%9F%E4%B8%80%E6%B3%A8%E5%86%8C1
>>> console.log(escape("统一注册1"));
%u7EDF%u4E00%u6CE8%u518C1

<?php
echo iconv("UTF-8", "gbk", urldecode("%E7%BB%9F%E4%B8%80%E6%B3%A8%E5%86%8C1"));
echo "\n";
echo urldecode("%u7EDF%u4E00%u6CE8%u518C1");
// You can decode it like this (unescape() is the PHP function defined later in this article):
// echo iconv("UTF-8", "gbk", unescape("%u7EDF%u4E00%u6CE8%u518C1"));
exit;
?>

Output result:
==============================
统一注册1
%u7EDF%u4E00%u6CE8%u518C1
==============================

Result description:

1. encodeURIComponent always converts the input to UTF-8 and percent-encodes it byte by byte.

2. escape encodes non-ASCII characters by their Unicode code point (the %uXXXX form). Because it also encodes characters that are unsafe in a URL, its output can be used in a URL as well. However, the server will not decode it automatically. The following PHP decoding function, found in the manual, handles it:

<?php
// Decode the %uXXXX sequences produced by JavaScript's escape(), as well as
// HTML numeric entities (&#xXXXX; and &#NNNN;), into UTF-8.
function unescape($str) {
    $str = rawurldecode($str);
    // Split into %uXXXX tokens, hexadecimal entities, decimal entities, and single characters.
    preg_match_all("/(?:%u.{4})|&#x.{4};|&#\d+;|.+/U", $str, $r);
    $ar = $r[0];
    foreach ($ar as $k => $v) {
        // pack() emits big-endian bytes here, so the byte order is named explicitly.
        if (substr($v, 0, 2) == "%u") {
            $ar[$k] = iconv("UCS-2BE", "UTF-8", pack("H4", substr($v, -4)));
        } elseif (substr($v, 0, 3) == "&#x") {
            $ar[$k] = iconv("UCS-2BE", "UTF-8", pack("H4", substr($v, 3, -1)));
        } elseif (substr($v, 0, 2) == "&#") {
            $ar[$k] = iconv("UCS-2BE", "UTF-8", pack("n", substr($v, 2, -1)));
        }
    }
    return join("", $ar);
}
?>

>>> console.log(escape(" !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~"));
%20%21%22%23%24%25%26%27%28%29*+%2C-./0123456789%3A%3B%3C%3D%3E%3F@ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D%7E
>>> console.log(encodeURIComponent(" !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~"));
%20!%22%23%24%25%26'()*%2B%2C-.%2F0123456789%3A%3B%3C%3D%3E%3F%40ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D~
>>> console.log(escape(" !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~").replace(/%.{2}/g,""));
*+-./0123456789@ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz
>>> console.log(encodeURIComponent(" !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~").replace(/%.{2}/g,""));
!'()*-.0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz~

Result comparison:

Characters not encoded by escape (besides letters and digits): * + - . / @ _ (7 in total)

Characters not encoded by encodeURIComponent (besides letters and digits): ! ' ( ) * - . _ ~ (9 in total)
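Given the lists above, output matching encodeURIComponent can be approximated in PHP by running rawurlencode and then restoring the five characters it encodes but encodeURIComponent does not. This is a sketch, not a standard library function: the name encodeUriComponentLike is made up here, and it assumes PHP 7 or later (for the type declarations; rawurlencode has left ~ unencoded since PHP 5.3) and UTF-8 input.

```php
<?php
// Hypothetical helper: approximates JavaScript's encodeURIComponent().
function encodeUriComponentLike(string $value): string
{
    // rawurlencode() already leaves letters, digits, and - _ . ~ untouched.
    // Undo the encoding of the remaining characters that encodeURIComponent keeps.
    return strtr(rawurlencode($value), [
        '%21' => '!',
        '%27' => "'",
        '%28' => '(',
        '%29' => ')',
        '%2A' => '*',
    ]);
}

echo encodeUriComponentLike("hello world (v1.0)!"), "\n"; // hello%20world%20(v1.0)!
echo encodeUriComponentLike("a+b~c"), "\n";               // a%2Bb~c
```

Because rawurlencode turns a literal % into %25, the sequences being replaced can only come from the five characters themselves, so the substitution does not corrupt other data.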

That covers the differences between urlencode and rawurlencode in PHP. I hope it serves as a useful reference.
