Using JS to the URL in the Chinese characters to escape code.
<a href= "onclick=" window.open (' product_list.php?p_sort= ' +escape (' Script House ')); " > Click the link after the effect:
Reference: http://127.0.0.1/shop/product_list.php?p_sort=PHP%u5F00%u53D1%u8D44%u6E90%u7F51
PHP's urldecode() and base64_decode() clearly cannot decode this, because the %uXXXX form is not standard percent-encoding.
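A quick check of that claim, as a minimal sketch: urldecode() only understands "%" followed by two hex digits, so the %uXXXX sequences should pass through untouched. The variable names are illustrative.

```php
<?php
// The query-string value produced by JavaScript's escape() in the example URL.
$escaped = 'PHP%u5F00%u53D1%u8D44%u6E90%u7F51';

// urldecode() decodes only "%" followed by two hex digits. "%u" is not such
// a sequence, so the string comes back unchanged.
$decoded = urldecode($escaped);

var_dump($decoded === $escaped);
```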
The workaround is to write the inverse function in PHP:
function js_unescape($str) {
    $ret = '';
    $len = strlen($str);
    for ($i = 0; $i < $len; $i++)
    {
        if ($str[$i] == '%' && isset($str[$i + 1]) && $str[$i + 1] == 'u')
        {
            // %uXXXX: four hex digits giving a Unicode code unit; emit it as UTF-8
            $val = hexdec(substr($str, $i + 2, 4));
            if ($val < 0x80) $ret .= chr($val);
            else if ($val < 0x800) $ret .= chr(0xc0 | ($val >> 6)) . chr(0x80 | ($val & 0x3f));
            else $ret .= chr(0xe0 | ($val >> 12)) . chr(0x80 | (($val >> 6) & 0x3f)) . chr(0x80 | ($val & 0x3f));
            $i += 5; // skip the rest of "%uXXXX"
        }
        else if ($str[$i] == '%')
        {
            // %XX: ordinary percent-encoding
            $ret .= urldecode(substr($str, $i, 3));
            $i += 2; // skip the rest of "%XX"
        }
        else $ret .= $str[$i];
    }
    return $ret;
}
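The bit arithmetic in the decoder is easier to follow on a single value. The sketch below isolates it in a hypothetical helper, codepoint_to_utf8() (not part of the original code), and applies it to the first escaped character of the example URL. It assumes code points in the Basic Multilingual Plane only and does not handle surrogate pairs.

```php
<?php
// Hypothetical helper: encode one BMP code point (0x0000-0xFFFF) as UTF-8,
// using the same shifts and masks as the decoder above.
function codepoint_to_utf8(int $cp): string
{
    if ($cp < 0x80) {
        return chr($cp);                        // 0xxxxxxx
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6))           // 110xxxxx
             . chr(0x80 | ($cp & 0x3F));        // 10xxxxxx
    }
    return chr(0xE0 | ($cp >> 12))              // 1110xxxx
         . chr(0x80 | (($cp >> 6) & 0x3F))      // 10xxxxxx
         . chr(0x80 | ($cp & 0x3F));            // 10xxxxxx
}

// %u5F00 from the example URL: 0x5F00 becomes the three bytes E5 BC 80.
echo bin2hex(codepoint_to_utf8(0x5F00)), "\n";  // e5bc80
```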
Note that escape() works on Unicode values, so the function above produces UTF-8. If your page uses GB2312, the result must be converted to that encoding, otherwise the Chinese text will be garbled. If your page already uses UTF-8, no conversion is needed.
The code is as follows:
    print iconv('utf-8', 'gb2312', js_unescape($_REQUEST['p_sort']));
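To see what the iconv step does in isolation, here is a minimal sketch. It assumes the iconv extension is available and that it supports GB2312. The sample bytes are the UTF-8 form of U+5F00, the first escaped character in the example URL.

```php
<?php
// "\xE5\xBC\x80" is the UTF-8 encoding of U+5F00.
$utf8 = "\xE5\xBC\x80";

// Convert to GB2312 for a GB2312 page: three UTF-8 bytes become two bytes.
$gb = iconv('UTF-8', 'GB2312', $utf8);

// Converting back should reproduce the original bytes, a cheap sanity check.
$roundTrip = iconv('GB2312', 'UTF-8', $gb);

var_dump(strlen($gb), $roundTrip === $utf8);
```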
With this, we have successfully decoded the output of JS escape.
In addition, I found a PHP function that implements the JS escape encoding:
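function phpescape($str)
{
    // Assumes $str is GB2312-encoded (two bytes per Chinese character)
    $sublen = strlen($str);
    $returnString = "";
    for ($i = 0; $i < $sublen; $i++)
    {
        if (ord($str[$i]) >= 0x80)
        {
            // UCS-2BE fixes the byte order explicitly. With plain "UCS-2" the two
            // bytes may come out swapped on some platforms (the original suggested
            // swapping them by hand on Windows).
            $tmpString = bin2hex(iconv("gb2312", "UCS-2BE", substr($str, $i, 2)));
            $returnString .= "%u" . strtoupper($tmpString);
            $i++; // skip the second byte of the character
        } else
        {
            // always emit two hex digits
            $returnString .= sprintf("%%%02X", ord($str[$i]));
        }
    }
    return $returnString;
}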
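One detail worth checking in byte-wise encoders of this kind is how a byte is turned into "%XX". As a small sketch: dechex() returns the shortest hex string, so bytes below 0x10 come out with one digit unless they are padded. The variable names are illustrative.

```php
<?php
// A byte below 0x10, here a newline (decimal 10).
$byte = ord("\n");

// dechex() does not pad: the result has a single hex digit.
$unpadded = '%' . dechex($byte);

// sprintf() with %02X always yields two uppercase hex digits.
$padded = sprintf('%%%02X', $byte);

echo $unpadded, ' vs ', $padded, "\n";
```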
PHP's JSON functions expect UTF-8, so sending Chinese text in another encoding (such as GB2312) through JSON results in lost or garbled data. The string must therefore be encoded before it is transmitted. Since the data will be parsed by JS on the receiving side, and JS has an unescape function, it is convenient to have an escape function in PHP: encode the data on the server, then decode it on the client with unescape.
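A minimal sketch of how json_encode() treats Chinese text, assuming PHP 7 or later with the json extension. The sample strings are written as raw bytes so the file's own encoding does not matter, and the two GB2312-style bytes are only illustrative.

```php
<?php
// UTF-8 input: json_encode() escapes non-ASCII characters as \uXXXX by default.
$utf8 = "\xE5\xBC\x80";                 // U+5F00 in UTF-8
$escapedJson = json_encode($utf8);

// JSON_UNESCAPED_UNICODE keeps the raw UTF-8 bytes instead.
$rawJson = json_encode($utf8, JSON_UNESCAPED_UNICODE);

// Non-UTF-8 input (two GB2312-style bytes) is rejected: the call returns false.
$gbResult = json_encode("\xBF\xAA");

var_dump($escapedJson, $rawJson === '"' . $utf8 . '"', $gbResult);
```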
Searching the internet turns up many PHP implementations of escape, most of them the same as the following:
function phpescape($str) {
    preg_match_all("/[\x80-\xff].|[\x01-\x7f]+/", $str, $r);
    $ar = $r[0];
    foreach ($ar as $k => $v) {
        if (ord($v[0]) < 128)
            $ar[$k] = rawurlencode($v);
        else
            // UCS-2BE makes the byte order explicit (high byte first)
            $ar[$k] = "%u" . bin2hex(iconv("GB2312", "UCS-2BE", $v));
    }
    return join("", $ar);
}
This function works well, but a novice (like me) who does not understand how it works may feel uneasy using it, so I will explain the principle here. Using someone else's code is like standing on the shoulders of giants, but if you don't understand that code, sooner or later you will fall off.
The first statement, preg_match_all("/[\x80-\xff].|[\x01-\x7f]+/", $str, $r);, uses a regular expression to split the whole string into tokens. [\x80-\xff]. matches one Chinese character: \x introduces a hexadecimal byte value, [] is a character class, and "." matches any single character (except a newline). So [\x80-\xff]. matches two bytes, the first of which lies in the hexadecimal range 80 to FF, which is exactly the range of the first byte of a GB2312 Chinese character. Together the two bytes form one complete Chinese character. For details of how Chinese characters are encoded, search online.
Similarly, [\x01-\x7f]+ matches English text. English uses ASCII, whose values are below 128, that is, hexadecimal 01 to 7F, and "+" means one or more characters, so [\x01-\x7f]+ matches a run of consecutive ASCII characters.
$ar = $r[0]; // $r[0] holds the array of matched tokens
foreach ($ar as $k => $v) {
    if (ord($v[0]) < 128) // a first byte below 128 means the token is ASCII (English) text
        $ar[$k] = rawurlencode($v); // encode it directly with rawurlencode
    else
        $ar[$k] = "%u" . bin2hex(iconv("GB2312", "UCS-2BE", $v)); // otherwise use iconv to convert the Chinese character to UCS-2 (big-endian), a Unicode encoding
}
In JavaScript, you can use unescape to decode it.
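To make the tokenizing step described above concrete, the sketch below runs the same pattern on a short sample. The input is written as raw bytes (ASCII text around two illustrative double-byte characters), so no particular file encoding is assumed.

```php
<?php
// Sample input: "ab", two double-byte characters, then "cd".
$str = "ab\xBF\xAA\xB7\xA2cd";

// Same pattern as in phpescape(): a lead byte 0x80-0xFF plus one more byte,
// or a run of ASCII bytes.
preg_match_all('/[\x80-\xff].|[\x01-\x7f]+/', $str, $r);

foreach ($r[0] as $token) {
    // ord() of the first byte tells the two kinds of token apart.
    $kind = ord($token[0]) < 128 ? 'ascii ' : 'double';
    echo $kind, ' ', bin2hex($token), "\n";
}
```

Each ASCII run comes back as one token, and each double-byte character as its own token, which is why the loop can pick the encoding per token.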
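The ranges \u0391-\uffe5 and \u4e00-\u9fa5 are both used to match Chinese.
The former, however, also seems to include the symbols found in Chinese text (it runs from U+0391, Greek capital alpha, to U+FFE5, the full-width yen sign ￥), while the latter probably covers only Chinese characters proper.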
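The difference between the two ranges can be checked directly. This is a sketch for UTF-8 input. Note that PHP's PCRE does not accept \uXXXX inside a pattern, so the ranges are written with \x{...} and the /u modifier. The sample characters are given as raw UTF-8 bytes.

```php
<?php
$wide   = '/^[\x{0391}-\x{ffe5}]$/u';   // U+0391 .. U+FFE5
$narrow = '/^[\x{4e00}-\x{9fa5}]$/u';   // U+4E00 .. U+9FA5

$han   = "\xE5\xBC\x80";   // U+5F00, a Chinese character
$alpha = "\xCE\x91";       // U+0391, Greek capital alpha
$yen   = "\xEF\xBF\xA5";   // U+FFE5, full-width yen sign

foreach (['han' => $han, 'alpha' => $alpha, 'yen' => $yen] as $name => $ch) {
    // preg_match() returns 1 on a match and 0 otherwise.
    printf("%-5s wide=%d narrow=%d\n", $name,
        preg_match($wide, $ch), preg_match($narrow, $ch));
}
```

Only the Chinese character matches both ranges. The Greek letter and the yen sign match the wide range alone.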
The decoding function is:
function unescape($str) {
    $str = rawurldecode($str);
    // The U (ungreedy) modifier makes the final ".+" match one character at a
    // time, so later %uXXXX sequences are not swallowed by it
    preg_match_all("/%u.{4}|&#x.{4};|&#\d+;|.+/U", $str, $r);
    $ar = $r[0];
    foreach ($ar as $k => $v) {
        if (substr($v, 0, 2) == "%u")           // %uXXXX
            $ar[$k] = iconv("UCS-2BE", "GBK", pack("H4", substr($v, -4)));
        elseif (substr($v, 0, 3) == "&#x")      // &#xXXXX; (hexadecimal entity)
            $ar[$k] = iconv("UCS-2BE", "GBK", pack("H4", substr($v, 3, -1)));
        elseif (substr($v, 0, 2) == "&#") {     // &#NNNNN; (decimal entity)
            $ar[$k] = iconv("UCS-2BE", "GBK", pack("n", substr($v, 2, -1)));
        }
    }
    return join("", $ar);
}
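The two pack() formats used by the decoder can be tried on their own. As a small sketch: "H4" turns four hex digits into two raw bytes, and "n" writes a 16-bit integer with the high byte first, so both give the same big-endian bytes for the same code unit. The value 0x5F00 (decimal 24320) is taken from the example URL.

```php
<?php
// pack('H4', ...) turns four hex digits into two raw bytes, high byte first.
$fromHex = pack('H4', '5F00');

// pack('n', ...) writes an unsigned 16-bit integer in big-endian order,
// which suits a decimal entity such as &#24320; (24320 == 0x5F00).
$fromDec = pack('n', 24320);

echo bin2hex($fromHex), ' ', bin2hex($fromDec), "\n"; // 5f00 5f00
```

Because both results are big-endian, it seems safest to name the source encoding as "UCS-2BE" when passing them to iconv, rather than relying on the platform's default byte order for "UCS-2".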
I. Encoding ranges
1. GBK (GB2312/GB18030)
\x00-\xff  GBK double-byte encoding range
\x20-\x7f  ASCII
\xa1-\xff  Chinese
\x80-\xff  Chinese
2. UTF-8 (Unicode)
\u4e00-\u9fa5  (Chinese)
\x3130-\x318f  (Korean)
\xac00-\xd7a3  (Korean)
\u0800-\u4e00  (Japanese)
PS: the Korean syllables (\xac00-\xd7a3) lie above \u9fa5.
Regular expression examples:
preg_replace("/([\x80-\xff])/", "", $str);
preg_replace('/([\x{4e00}-\x{9fa5}])/u', '', $str); // PHP's PCRE needs \x{...} with the /u modifier instead of \uXXXX
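The two replacements above target different kinds of input. A minimal sketch with illustrative sample bytes: the first works byte by byte on GBK/GB2312 data, and the second works on code points in UTF-8 data.

```php
<?php
// Byte mode (GBK/GB2312 data): remove every byte in 0x80-0xFF.
$gbk = "ab\xBF\xAAcd";
$asciiOnly = preg_replace('/[\x80-\xff]/', '', $gbk);

// UTF-8 mode: remove code points U+4E00..U+9FA5.
$utf8 = "ab\xE5\xBC\x80cd";
$noHan = preg_replace('/[\x{4e00}-\x{9fa5}]/u', '', $utf8);

var_dump($asciiOnly, $noHan);
```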