For URL encoding in PHP, you can use urlencode() or rawurlencode(). The difference is that the former encodes a space as +, while the latter encodes it as %20. Note that only part of a URL should be encoded (for example, a query parameter value or a path segment); if you encode the whole URL, the colons and slashes that give it structure will be escaped as well. The following is a detailed explanation:
string urlencode(string $str)
Returns a string in which all non-alphanumeric characters except -_. have been replaced with a percent sign (%) followed by two hexadecimal digits, and spaces are encoded as plus signs (+). This is the same encoding used for data posted from a WWW form, that is, the application/x-www-form-urlencoded media type. For historical reasons it differs from the RFC 1738 encoding (see rawurlencode()) in that spaces are encoded as plus signs (+). This function is convenient for encoding a string to be used in the query part of a URL, as a way to pass variables to the next page:
Example 1. urlencode() example
The code is as follows:
<?php
echo '<a href="mycgi?foo=', urlencode($userinput), '">';
?>
Note: Be careful with variable names that match HTML entities. Sequences such as &amp, &copy and &pound are parsed by the browser, and the actual entity is used in place of the intended variable name. This is an obvious source of confusion that the W3C has been warning about for years. Reference: http://www.w3.org/TR/html4/appendix/notes.html#h-B.2.2 PHP supports changing the argument separator to the semicolon recommended by the W3C through the arg_separator .ini directive. Unfortunately, most user agents do not send form data in the semicolon-separated format. A more portable solution is to use &amp; instead of & as the separator. You do not need to change PHP's arg_separator for this. Leave it as &, and simply encode your URLs with htmlentities(urlencode($data)).
Example 2. urlencode() and htmlentities() example
The code is as follows:
<?php
echo '<a href="mycgi?foo=', htmlentities(urlencode($userinput)), '">';
?>
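As a rough sketch of the note above, the following builds a query string with two parameters and escapes it once for use inside an HTML attribute. The script name mycgi, the parameter names and the values are hypothetical and chosen only to show why the bare & needs escaping (a parameter named copy would otherwise read as &copy).

```php
<?php
// Hypothetical parameter values, for illustration only.
$foo = 'a&b';
$bar = 'x y';

// Encode each value on its own, then join the pairs with a plain '&'.
$query = 'foo=' . urlencode($foo) . '&copy=' . urlencode($bar);

// Escape the finished query string once for the HTML attribute context,
// so the separator becomes &amp; and '&copy' is not read as an entity.
$link = '<a href="mycgi?' . htmlentities($query) . '">link</a>';

echo $link, "\n";
```

The separator stays & on the PHP side; only the HTML output carries &amp;, which the browser turns back into & before sending the request.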
string rawurlencode(string $str)
Returns a string in which all non-alphanumeric characters except -_. have been replaced with a percent sign (%) followed by two hexadecimal digits. This is the encoding described in RFC 1738 for protecting literal characters from being interpreted as special URL delimiters, and for protecting URLs from being mangled by transmission media that perform character conversions (like some email systems). Since PHP 5.3 the function follows RFC 3986, so the tilde (~) is also left unencoded. For example, if you want to include a password in an FTP URL:
Example 1. rawurlencode() example 1
The code is as follows:
<?php
echo '<a href="ftp://user:', rawurlencode('foo @+%/'),
     '@ftp.my.com/x.txt">';
?>
Or, if you want to pass the information through the PATH_INFO component of the URL:
Example 2. rawurlencode() example 2
The code is as follows:
<?php
echo '<a href="http://x.com/department_list_script/',
     rawurlencode('sales and marketing/Miami'), '">';
?>
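To make the difference between the two encoders concrete, here is a minimal comparison. It also shows why only a component of the URL should be encoded. The host example.com and the dept parameter are placeholders, not taken from the examples above.

```php
<?php
$value = 'sales & marketing/Miami';

// Same input, two encoders: they differ only in how the space is written.
$formStyle = urlencode($value);     // space becomes +
$rawStyle  = rawurlencode($value);  // space becomes %20
echo $formStyle, "\n", $rawStyle, "\n";

// Encoding a whole URL also escapes the ':', '/', '?' and '=' it relies on.
$wholeUrl = 'http://example.com/list?dept=' . $value;
$broken = urlencode($wholeUrl);
echo $broken, "\n";

// Encoding only the parameter value keeps the URL structure intact.
$goodUrl = 'http://example.com/list?dept=' . urlencode($value);
echo $goodUrl, "\n";
```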
For decoding, use the corresponding urldecode() and rawurldecode(). Note that rawurldecode() does not decode the plus sign ('+') into a space, while urldecode() does. The following is a detailed explanation:
string urldecode(string $str)
Decodes any %## encoding in the given string and returns the decoded string.
Example 1. urldecode() example
The code is as follows:
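<?php
// $_SERVER['QUERY_STRING'] replaces the old register_globals variable $QUERY_STRING
$a = explode('&', $_SERVER['QUERY_STRING']);
$i = 0;
while ($i < count($a)) {
    // explode() with a limit replaces split(), which was removed in PHP 7
    $b = explode('=', $a[$i], 2);
    echo 'Value for parameter ', htmlspecialchars(urldecode($b[0])),
         ' is ', htmlspecialchars(urldecode($b[1] ?? '')), "<br />\n";
    $i++;
}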
?>
string rawurldecode(string $str)
Returns a string in which every percent sign (%) followed by two hexadecimal digits has been replaced with the literal character it represents.
Example 1. rawurldecode() example
The code is as follows:
<?php
echo rawurldecode('foo%20bar%40baz'); // foo bar@baz
?>
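The different treatment of the plus sign by the two decoders can be checked directly. The input string below is made up for the purpose: it contains a literal +, an encoded space (%20) and an encoded plus sign (%2B).

```php
<?php
$encoded = 'a+b%20c%2Bd';

$viaUrldecode    = urldecode($encoded);     // '+' is treated as a space
$viaRawurldecode = rawurldecode($encoded);  // '+' is left untouched

var_dump($viaUrldecode, $viaRawurldecode);
```

As a rule of thumb, decode with the counterpart of whichever function did the encoding, so that a literal plus sign in rawurlencode() output is not turned into a space.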
One thing to note, however: urldecode() and rawurldecode() simply return the decoded bytes, and browsers normally send those bytes as UTF-8. If the URL contains Chinese characters and the page is not served as UTF-8, the decoded string must be converted to the page's encoding before it will display correctly.
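A minimal sketch of such a conversion, assuming the mbstring extension is available, the client sent UTF-8, and the page is served as GBK (named CP936 in mbstring). The target encoding is only an example; substitute whatever your page actually uses.

```php
<?php
// Assumption: the bytes in the URL are UTF-8, as most browsers send them.
$encoded = '%E4%B8%AD%E6%96%87'; // two Chinese characters, percent-encoded as UTF-8

// Decoding yields the raw UTF-8 bytes.
$utf8 = urldecode($encoded);

// Convert from UTF-8 to the page encoding (here CP936, i.e. GBK).
$gbk = mb_convert_encoding($utf8, 'CP936', 'UTF-8');

// Print the bytes as hex so the result is visible on any terminal.
echo bin2hex($utf8), "\n", bin2hex($gbk), "\n";
```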
Another problem arises when the URL you receive is not in %nn format (n = 0..f) but in %unnnn format (n = 0..f), as produced, for example, by JavaScript's escape(). urldecode() and rawurldecode() cannot decode that format correctly, but the following function can:
The code is as follows:
function utf8RawUrlDecode($source)
{
    $decodedStr = '';
    $pos = 0;
    $len = strlen($source);
    while ($pos < $len) {
        $charAt = substr($source, $pos, 1);
        if ($charAt == '%') {
            $pos++;
            $charAt = substr($source, $pos, 1);
            if ($charAt == 'u') {
                // We got a unicode character
                $pos++;
                $unicodeHexVal = substr($source, $pos, 4);
                $unicode = hexdec($unicodeHexVal);
                // Emit an HTML numeric entity; it is plain ASCII, so no
                // utf8_encode() call (deprecated since PHP 8.2) is needed.
                $decodedStr .= '&#' . $unicode . ';';
                $pos += 4;
            } else {
                // We have an escaped ascii character
                $hexVal = substr($source, $pos, 2);
                $decodedStr .= chr(hexdec($hexVal));
                $pos += 2;
            }
        } else {
            $decodedStr .= $charAt;
            $pos++;
        }
    }
    return $decodedStr;
}
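The function above turns each %unnnn escape into an HTML numeric entity such as &#20013;, which only displays correctly when the result is output as HTML. If you need the actual UTF-8 characters instead, one possible variant is sketched below. The name decodePercentU is hypothetical, and the sketch assumes PHP 7.2 or later with the mbstring extension (for mb_chr()). It does not combine surrogate pairs, so characters outside the Basic Multilingual Plane are left undecoded.

```php
<?php
// Hypothetical helper, not part of PHP: decodes %uXXXX and %XX escapes
// in a single pass and returns a UTF-8 string.
function decodePercentU(string $source): string
{
    $result = preg_replace_callback(
        '/%u([0-9A-Fa-f]{4})|%([0-9A-Fa-f]{2})/',
        static function (array $m): string {
            if ($m[1] !== '') {
                // %uXXXX: convert the code point to its UTF-8 bytes.
                $char = mb_chr(hexdec($m[1]), 'UTF-8');
                // Leave the escape as-is if it is not a valid code point
                // (for example one half of a surrogate pair).
                return $char === false ? $m[0] : $char;
            }
            // %XX: an ordinary percent-encoded byte.
            return chr(hexdec($m[2]));
        },
        $source
    );

    // preg_replace_callback() returns null on error; fall back to the input.
    return $result ?? $source;
}

echo decodePercentU('%u4E2D%u6587%20ok'), "\n";
```

Handling both escape forms in one regular expression avoids decoding the same text twice, which matters when a %uXXXX escape itself decodes to a percent sign.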