// Automatically convert character sets; arrays are supported
function auto_charset($fContents, $from = 'GBK', $to = 'UTF-8') {
    $from = strtoupper($from) == 'UTF8' ? 'UTF-8' : $from;
    $to = strtoupper($to) == 'UTF8' ? 'UTF-8' : $to;
    if (strtoupper($from) == strtoupper($to) || empty($fContents) || (is_scalar($fContents) && !is_string($fContents))) {
        // Do not convert if the encodings are the same, or the value is empty or a non-string scalar
        return $fContents;
    }
    if (is_string($fContents)) {
        if (function_exists('mb_convert_encoding')) {
            return mb_convert_encoding($fContents, $to, $from);
        } elseif (function_exists('iconv')) {
            return iconv($from, $to, $fContents);
        } else {
            return $fContents;
        }
    } elseif (is_array($fContents)) {
        foreach ($fContents as $key => $val) {
            // Convert the key as well as the value
            $_key = auto_charset($key, $from, $to);
            $fContents[$_key] = auto_charset($val, $from, $to);
            if ($key != $_key)
                unset($fContents[$key]);
        }
        return $fContents;
    }
    else {
        return $fContents;
    }
}
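To illustrate the array handling, here is a condensed, self-contained sketch of the same idea. The name convert_deep is hypothetical, the sketch assumes the mbstring extension is loaded, and it builds a new array instead of modifying the input in place.

```php
<?php
// Hypothetical helper: recursively convert string keys and values.
// Assumes mbstring is loaded; non-string scalars are returned untouched.
function convert_deep($data, $from = 'GBK', $to = 'UTF-8') {
    if (is_string($data)) {
        return mb_convert_encoding($data, $to, $from);
    }
    if (is_array($data)) {
        $out = array();
        foreach ($data as $key => $val) {
            // Keys may be integers; only string keys are transcoded.
            $newKey = is_string($key) ? mb_convert_encoding($key, $to, $from) : $key;
            $out[$newKey] = convert_deep($val, $from, $to);
        }
        return $out;
    }
    return $data;
}

// The two characters "中文" as GBK bytes, used as both a key and a value.
$gbk = "\xD6\xD0\xCE\xC4";
$result = convert_deep(array($gbk => array($gbk, 42)));
```

After the call, both the key and the nested string hold the UTF-8 bytes, while the integer 42 is passed through unchanged.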
When we accept data submitted by unknown clients, the clients do not all use the same encoding, yet the server can only process data in one encoding. This means incoming strings have to be converted to a specific encoding.
The obvious idea is to transcode directly with iconv, but the iconv function needs both the input encoding and the output encoding, and at this point we do not know the encoding of the string we received. It would help if the encoding of the received string could be detected.
There are generally two solutions to this problem.
Option one
Have the client state the encoding when it submits the data. This requires one extra variable that carries the encoding.
$string = $_GET['charset'] == 'GBK' ? iconv('GBK', 'UTF-8', $_GET['str']) : $_GET['str'];
This option does not work well if there is no such agreement with the client, or if we cannot control the client.
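If the client-declared charset is used, it is probably worth validating it before passing it to iconv. The following is a minimal sketch under that assumption. The function name convert_declared and the whitelist contents are hypothetical, and the iconv extension is assumed to be available.

```php
<?php
// Hypothetical helper: trust a client-declared charset only if it is
// on a short whitelist. Assumes the iconv extension is loaded.
function convert_declared($str, $declared, $target = 'UTF-8') {
    $allowed = array('GBK', 'GB2312', 'BIG5', 'UTF-8');
    $declared = strtoupper(trim((string) $declared));
    if ($declared === 'UTF8') {
        $declared = 'UTF-8';
    }
    if (!in_array($declared, $allowed, true) || $declared === $target) {
        // Unknown or already-matching charset: leave the input untouched.
        return $str;
    }
    $converted = iconv($declared, $target, $str);
    // iconv returns false on failure; fall back to the original bytes.
    return $converted === false ? $str : $converted;
}

// Read the request parameters defensively; either may be missing.
$charset = isset($_GET['charset']) ? $_GET['charset'] : '';
$raw = isset($_GET['str']) ? $_GET['str'] : '';
$string = convert_declared($raw, $charset);
```

The whitelist keeps arbitrary client input out of the iconv call, at the cost of having to list every encoding you intend to support.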
Option two
Detect the encoding of the received data directly on the server.
This is certainly the ideal solution. The question is how to detect the encoding of a string. PHP's mbstring extension provides mb_check_encoding, which offers the functionality we need.
$str = mb_check_encoding($_GET['str'], 'GBK') ? iconv('GBK', 'UTF-8', $_GET['str']) : $_GET['str'];
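One caveat: some byte sequences are valid in more than one encoding, so a GBK check on its own can misclassify input that is already UTF-8. A minimal sketch that tests for UTF-8 first, assuming mbstring is loaded (the name to_utf8_checked is hypothetical):

```php
<?php
// Hypothetical helper: return the string as UTF-8, checking UTF-8
// validity before falling back to GBK. Assumes mbstring is loaded.
function to_utf8_checked($str) {
    if (mb_check_encoding($str, 'UTF-8')) {
        // Already valid UTF-8 (this includes plain ASCII).
        return $str;
    }
    if (mb_check_encoding($str, 'GBK')) {
        return mb_convert_encoding($str, 'UTF-8', 'GBK');
    }
    // Neither check passed: return the bytes unchanged.
    return $str;
}
```

Checking the stricter encoding first is a heuristic, not a guarantee: short GBK strings can occasionally happen to be valid UTF-8 as well.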
However, this requires the mbstring extension to be enabled, and sometimes it is not enabled on our production servers. In that situation, you need to use the following functions to determine the encoding.
I did not write the following functions.
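function isGb2312($string) {
    for ($i = 0; $i < strlen($string); $i++) {
        $v = ord($string[$i]);
        if ($v > 127) {
            // 228..233 (0xE4..0xE9) is the lead-byte range of most UTF-8 encoded Chinese characters
            if (($v >= 228) && ($v <= 233))
            {
                // Not enough bytes left for a 3-byte UTF-8 sequence
                if (($i + 2) >= (strlen($string) - 1)) return true;
                $v1 = ord($string[$i + 1]);
                $v2 = ord($string[$i + 2]);
                // Two continuation bytes (0x80..0xBF) follow: treat as UTF-8
                if (($v1 >= 128) && ($v1 <= 191) && ($v2 >= 128) && ($v2 <= 191))
                    return false;
                else
                    return true;
            }
        }
    }
    return true;
}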
function isUtf8($string) {
    return preg_match('%^(?:
          [\x09\x0A\x0D\x20-\x7E]            # ASCII
        | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
        | \xE0[\xA0-\xBF][\x80-\xBF]         # excluding overlongs
        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
        | \xED[\x80-\x9F][\x80-\xBF]         # excluding surrogates
        | \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
        | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
        | \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
    )*$%xs', $string);
}
We can use either of these functions to detect the encoding and then convert the string to the target encoding.
$str = isGb2312($_GET['str']) ? iconv('GBK', 'UTF-8', $_GET['str']) : $_GET['str'];
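For comparison, here is a shorter way to get the same effect without mbstring. This sketch swaps in a different technique from the functions above: it relies on PCRE's u modifier, which makes preg_match fail on subjects that are not valid UTF-8. It assumes PCRE was built with UTF-8 support and that the iconv extension is loaded. The name ensure_utf8 is hypothetical.

```php
<?php
// Hypothetical helper: keep valid UTF-8 as-is, otherwise try to
// convert from a fallback encoding (GBK by default) with iconv.
function ensure_utf8($str, $fallback = 'GBK') {
    // With the u modifier, preg_match returns false for invalid UTF-8.
    if (preg_match('//u', $str) === 1) {
        return $str;
    }
    // Suppress the notice iconv raises on illegal input; a false
    // return value is handled explicitly below.
    $converted = @iconv($fallback, 'UTF-8', $str);
    return $converted === false ? $str : $converted;
}

// Missing parameter is treated as an empty string.
$str = ensure_utf8(isset($_GET['str']) ? $_GET['str'] : '');
```

As with the hand-written checks, this is a heuristic: it only decides whether the bytes form valid UTF-8 and cannot prove that the fallback encoding is really the one the client used.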