PHP Chinese URL transcoding

Source: Internet
Author: User
   
In PHP you can encode a URL with urlencode() or rawurlencode(). The difference is that urlencode() encodes a space as '+', while rawurlencode() encodes it as '%20'. Note that you should encode only the parts of the URL that need it (for example a parameter value), not the whole URL; otherwise the colons and slashes in the URL will be escaped as well. The details follow:

string urlencode ( string str )

Returns a string in which all non-alphanumeric characters except -_. have been replaced with a percent sign (%) followed by two hexadecimal digits, and spaces are encoded as plus signs (+). This is the same encoding used for data posted from a WWW form, that is, the application/x-www-form-urlencoded media type. For historical reasons it differs from the RFC 1738 encoding (see rawurlencode()) in that spaces become plus signs (+). This function is convenient for encoding a string that goes into the query part of a URL, and for passing variables to the next page:

Example 1. urlencode() example

<?php
echo '<a href="mycgi?foo=', urlencode($userinput), '">';
?>
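
As a rough illustration of the difference between the two encoders, and of why only part of a URL should be encoded, consider the following sketch. The host example.com, the script search.php and the parameter name q are placeholders, and the file is assumed to be saved as UTF-8.

```php
<?php
// A value containing a space and several reserved characters.
$value = 'a b&c=d/e';

$form = urlencode($value);     // space becomes "+"
$raw  = rawurlencode($value);  // space becomes "%20"

echo $form, "\n";  // a+b%26c%3Dd%2Fe
echo $raw, "\n";   // a%20b%26c%3Dd%2Fe

// Encode only the parameter value, never the whole URL.
$good = 'http://example.com/search.php?q=' . urlencode('PHP 中文');
$bad  = urlencode('http://example.com/search.php?q=PHP 中文');

echo $good, "\n";  // the scheme, host and "?" stay intact
echo $bad, "\n";   // ":" and "/" are escaped too, so this is no longer a usable URL
```

The second pair shows the pitfall mentioned above: once the colon and slashes are escaped, the result can only be used as a value inside another URL, not as a link target.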

Note: Be careful with variables whose names match HTML entities. Things like &amp, &copy and &pound are parsed by the browser, and the actual entity is used instead of the intended variable name. This is an obvious source of confusion that the W3C has been warning about for years. Reference: http://www.w3.org/TR/html4/appendix/notes.html#h-B.2.2 PHP supports changing the argument separator to the semicolon recommended by the W3C through the arg_separator .ini directive. Unfortunately, most user agents do not send form data in this semicolon-separated format. A simpler solution is to use &amp; instead of & as the separator. You do not need to change PHP's arg_separator for this. Leave it as &, and just encode your URL with htmlentities(urlencode($data)).

Example 2. urlencode() and htmlentities() example

<?php
echo '<a href="mycgi?foo=', htmlentities(urlencode($userinput)), '">';
?>
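
When a link carries more than one parameter, the separator itself has to be escaped for HTML. A minimal sketch, assuming two hypothetical input values and a placeholder script name (mycgi), could look like this:

```php
<?php
// Hypothetical input values; in a real page these would come from the user.
$foo = 'a&b';
$bar = 'x y';

// URL-encode each value, join the pairs with a plain "&" ...
$query = 'foo=' . urlencode($foo) . '&bar=' . urlencode($bar);

// ... then escape the finished query string once for HTML output,
// which turns the separator into "&amp;".
$link = '<a href="mycgi?' . htmlentities($query) . '">link</a>';

echo $query, "\n";  // foo=a%26b&bar=x+y
echo $link, "\n";   // <a href="mycgi?foo=a%26b&amp;bar=x+y">link</a>
```

The "&" inside the value is already percent-encoded by urlencode(), so only the separator is affected by the HTML escaping step.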

string rawurlencode ( string str )

Returns a string in which all non-alphanumeric characters except -_. have been replaced with a percent sign (%) followed by two hexadecimal digits. This is the encoding described in RFC 1738 for protecting literal characters from being interpreted as special URL delimiters, and for protecting URLs from being mangled by transmission media that perform character conversions (such as some e-mail systems). For example, if you want to include a password in an FTP URL:

Example 1. rawurlencode() example 1

<?php
echo '<a href="ftp://user:', rawurlencode('foo @+%/'),
     '@ftp.my.com/x.txt">';
?>

Or, if you want to pass information through the PATH_INFO component of the URL:

Example 2. rawurlencode() example 2

<?php
echo '<a href="http://x.com/department_list_script/',
     rawurlencode('sales and marketing/Miami'), '">';
?>
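
When a path has several components, each one can be encoded on its own so that the real "/" separators survive while a slash inside a component is escaped. The helper name encodePath() and the sample segments are illustrative assumptions, not part of PHP or of the original examples.

```php
<?php
// Illustrative helper (not a PHP built-in): encodes every path segment
// with rawurlencode() and joins the results with literal slashes.
function encodePath(array $segments): string
{
    return implode('/', array_map('rawurlencode', $segments));
}

$path = encodePath(['department list', 'sales and marketing/Miami']);

echo $path, "\n";  // department%20list/sales%20and%20marketing%2FMiami
```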

For decoding, use the corresponding urldecode() and rawurldecode(). Correspondingly, rawurldecode() does not decode a plus sign ('+') into a space, while urldecode() does. The details follow:

string urldecode ( string str )

Decodes any %## encoding in the given string, and decodes plus signs (+) into spaces. Returns the decoded string.

Example 1. urldecode() example

<?php
// Split the raw query string into name=value pairs.
$a = explode('&', $_SERVER['QUERY_STRING']);
$i = 0;
while ($i < count($a)) {
    // explode() is used here because split() was removed in PHP 7.
    $b = explode('=', $a[$i], 2);
    echo 'Value for parameter ', htmlspecialchars(urldecode($b[0])),
         ' is ', htmlspecialchars(urldecode($b[1] ?? '')), "<br />\n";
    $i++;
}
?>
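
The manual loop above can also be replaced by the built-in parse_str(), which splits a query string and applies the same form-style decoding in one step. The sample query string below is an assumption made for illustration.

```php
<?php
// Sample query string: a percent-encoded UTF-8 value and a "+"-encoded space.
$queryString = 'name=%E4%B8%AD%E6%96%87&q=a+b';

// parse_str() fills $params with decoded name => value pairs.
parse_str($queryString, $params);

foreach ($params as $name => $value) {
    echo 'Value for parameter ', htmlspecialchars($name),
         ' is ', htmlspecialchars($value), "\n";
}
```

Always pass the second argument; it keeps the decoded values in an array instead of creating variables in the current scope.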

string rawurldecode ( string str )

Returns a string in which every sequence of a percent sign (%) followed by two hexadecimal digits has been replaced with the literal character it represents.

Example 1. rawurldecode() example

<?php
echo rawurldecode('foo%20bar%40baz'); // foo bar@baz
?>
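
The different treatment of the plus sign by the two decoders is easiest to see side by side. The input string below is made up for this sketch.

```php
<?php
// "%2B" is an escaped plus sign; the bare "+" characters stand for spaces
// only under the form-style encoding.
$encoded = 'C%2B%2B+and+PHP';

$form = urldecode($encoded);     // "+" becomes a space
$raw  = rawurldecode($encoded);  // "+" is left untouched

echo $form, "\n";  // C++ and PHP
echo $raw, "\n";   // C+++and+PHP
```

This is why the decoder should match the encoder: data produced by urlencode() goes through urldecode(), and data produced by rawurlencode() goes through rawurldecode().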

Note, however, that urldecode() and rawurldecode() only turn the escapes back into bytes, and for URLs produced by a browser those bytes are normally UTF-8. If the URL contains Chinese characters and the page charset is not UTF-8, the decoded string must be converted to the page's charset before it displays correctly.
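
A minimal sketch of that conversion step follows. It assumes the mbstring extension is loaded and that the page charset is GBK; both are assumptions chosen for illustration, and iconv() could be used in a similar way.

```php
<?php
// Percent-encoded UTF-8 bytes of the two characters "中文".
$encoded = '%E4%B8%AD%E6%96%87';

// Decoding restores the UTF-8 byte sequence.
$utf8 = rawurldecode($encoded);

// Convert to the page charset (assumed here to be GBK).
$gbk = mb_convert_encoding($utf8, 'GBK', 'UTF-8');

echo strlen($utf8), "\n";   // 6 (three bytes per character in UTF-8)
echo strlen($gbk), "\n";    // 4 (two bytes per character in GBK)
echo bin2hex($gbk), "\n";   // d6d0cec4
```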

Another problem arises when the URL is not in the %nn format (n = 0..F) but in the %unnnn format (n = 0..F). urldecode() and rawurldecode() cannot decode this format correctly, but the following function can:

function utf8RawUrlDecode($source)
{
    $decodedStr = "";
    $pos = 0;
    $len = strlen($source);
    while ($pos < $len) {
        $charAt = substr($source, $pos, 1);
        if ($charAt == '%') {
            $pos++;
            $charAt = substr($source, $pos, 1);
            if ($charAt == 'u') {
                // We got a Unicode character: emit it as an HTML numeric entity.
                $pos++;
                $unicodeHexVal = substr($source, $pos, 4);
                $unicode = hexdec($unicodeHexVal);
                $entity = "&#" . $unicode . ';';
                // The entity is plain ASCII, so the original utf8_encode() call
                // (deprecated since PHP 8.2) changed nothing and is omitted.
                $decodedStr .= $entity;
                $pos += 4;
            } else {
                // We have an escaped ASCII character.
                $hexVal = substr($source, $pos, 2);
                $decodedStr .= chr(hexdec($hexVal));
                $pos += 2;
            }
        } else {
            $decodedStr .= $charAt;
            $pos++;
        }
    }
    return $decodedStr;
}
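
The function above emits HTML numeric entities for the %unnnn sequences. If real UTF-8 bytes are wanted instead, a variant along the following lines may serve as a starting point. The name decodeUnicodeEscapes() is hypothetical, the mbstring extension is assumed to be available, and, as in the function above, a plain %nn escape is turned into a single raw byte.

```php
<?php
// Hypothetical variant, not from the original article: decodes %uXXXX
// escapes into UTF-8 and %XX escapes into single bytes, in one pass.
function decodeUnicodeEscapes(string $source): string
{
    return (string) preg_replace_callback(
        // Group 1: a run of %uXXXX units (kept together so that UTF-16
        // surrogate pairs are converted as one character).
        // Group 2: the two hex digits of a plain %XX escape.
        '/((?:%u[0-9a-fA-F]{4})+)|%([0-9a-fA-F]{2})/',
        function (array $m): string {
            if (isset($m[2])) {
                return chr(hexdec($m[2]));
            }
            // Rebuild the UTF-16BE byte string, then convert it to UTF-8.
            $utf16 = '';
            foreach (explode('%u', substr($m[1], 2)) as $hex) {
                $utf16 .= pack('n', hexdec($hex));
            }
            return mb_convert_encoding($utf16, 'UTF-8', 'UTF-16BE');
        },
        $source
    );
}

echo decodeUnicodeEscapes('%u4E2D%u6587%20PHP'), "\n";  // 中文 PHP
```

Anything that is not a well-formed escape, such as a lone "%", is passed through unchanged.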
  
