- // Automatic character set conversion; arrays are converted recursively
- function auto_charset($fContents, $from = 'GBK', $to = 'utf-8') {
-     // Normalize the alias "UTF8" to "utf-8"
-     $from = strtoupper($from) == 'UTF8' ? 'utf-8' : $from;
-     $to = strtoupper($to) == 'UTF8' ? 'utf-8' : $to;
-     if (strtoupper($from) === strtoupper($to) || empty($fContents) || (is_scalar($fContents) && !is_string($fContents))) {
-         // Do not convert if the encodings are the same, the value is empty, or it is a non-string scalar
-         return $fContents;
-     }
-     if (is_string($fContents)) {
-         if (function_exists('mb_convert_encoding')) {
-             return mb_convert_encoding($fContents, $to, $from);
-         } elseif (function_exists('iconv')) {
-             return iconv($from, $to, $fContents);
-         } else {
-             return $fContents;
-         }
-     } elseif (is_array($fContents)) {
-         foreach ($fContents as $key => $val) {
-             // Convert the key as well as the value
-             $_key = auto_charset($key, $from, $to);
-             $fContents[$_key] = auto_charset($val, $from, $to);
-             // Drop the old entry if conversion changed the key
-             if ($key != $_key)
-                 unset($fContents[$key]);
-         }
-         return $fContents;
-     } else {
-         return $fContents;
-     }
- }
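To illustrate the recursive conversion of keys and values, here is a minimal sketch. The helper name `convert_recursive` is hypothetical and is not part of the original code; it builds a new array instead of modifying the one being iterated, which avoids the unset step. The sample bytes are assumed to be the GBK encoding of the characters U+4E2D and U+6587.

```php
<?php
// Hypothetical compact variant of the function above, for illustration only.
function convert_recursive($value, $from = 'GBK', $to = 'UTF-8')
{
    if (is_string($value)) {
        // Prefer mbstring when it is loaded, otherwise fall back to iconv.
        return function_exists('mb_convert_encoding')
            ? mb_convert_encoding($value, $to, $from)
            : iconv($from, $to, $value);
    }
    if (is_array($value)) {
        $out = array();
        foreach ($value as $key => $item) {
            // String keys are converted too; integer keys pass through unchanged.
            $out[convert_recursive($key, $from, $to)] = convert_recursive($item, $from, $to);
        }
        return $out;
    }
    return $value; // numbers, booleans and null are returned untouched
}

// GBK bytes: "\xD6\xD0" is U+4E2D, "\xCE\xC4" is U+6587.
$gbk  = array("\xD6\xD0" => array("\xCE\xC4", 42));
$utf8 = convert_recursive($gbk);

$firstKey = key($utf8);
echo bin2hex($firstKey), "\n";           // e4b8ad
echo bin2hex($utf8[$firstKey][0]), "\n"; // e69687
echo $utf8[$firstKey][1], "\n";          // 42
```

Printing the bytes as hex keeps the output readable regardless of the terminal's own encoding.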
At this point you might think of calling iconv directly to transcode. However, iconv needs both the input encoding and the output encoding as parameters, and we do not yet know what encoding the received string is in. If we could determine the encoding of the received string, the problem would be solved. There are two options to consider.

Option 1: have the client specify the encoding when it submits the data. This requires one extra variable that names the encoding.
- $string = $_GET['charset'] === 'GBK' ? iconv('GBK', 'utf-8', $_GET['str']) : $_GET['str'];
This option does not work well if there is no such agreement with the client, or if we cannot control the client.

Option 2: detect the encoding of the received data on the server side. This is the more desirable option; the question is how to detect the encoding of a string. In PHP, mb_check_encoding from the mbstring extension provides the functionality we need.
- $str = mb_check_encoding($_GET['str'], 'GBK') ? iconv('GBK', 'utf-8', $_GET['str']) : $_GET['str'];
However, this requires the mbstring extension to be enabled, and sometimes the production server does not have it enabled. In that case, the following functions can be used to determine the encoding.
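- function isGb2312($string) {
-     for ($i = 0; $i < strlen($string); $i++) {
-         $v = ord($string[$i]);
-         if ($v > 127) {
-             // 228-233 (0xE4-0xE9) are the lead bytes of 3-byte UTF-8 sequences for common Chinese characters
-             if (($v >= 228) && ($v <= 233))
-             {
-                 // Not enough bytes left for a 3-byte UTF-8 sequence
-                 if (($i + 2) >= strlen($string)) return true;
-                 $v1 = ord($string[$i + 1]);
-                 $v2 = ord($string[$i + 2]);
-                 // Two continuation bytes (128-191) follow: this looks like UTF-8, not GB2312
-                 if (($v1 >= 128) && ($v1 <= 191) && ($v2 >= 128) && ($v2 <= 191))
-                     return false;
-                 else
-                     return true;
-             }
-         }
-     }
-     return true;
- }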
- function isUtf8($string) {
-     return preg_match('%^(?:
-           [\x09\x0A\x0D\x20-\x7E]            # ASCII
-         | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
-         |  \xE0[\xA0-\xBF][\x80-\xBF]        # excluding overlongs
-         | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
-         |  \xED[\x80-\x9F][\x80-\xBF]        # excluding surrogates
-         |  \xF0[\x90-\xBF][\x80-\xBF]{2}     # planes 1-3
-         | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
-         |  \xF4[\x80-\x8F][\x80-\xBF]{2}     # plane 16
-     )*$%xs', $string);
- }
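As a shorter alternative to the hand-written pattern, PCRE's own UTF-8 validation can be used through the `u` modifier. This is a different technique from the regular expression above, not a restatement of it: it accepts any well-formed UTF-8, including control characters that the pattern above rejects. The helper name `is_valid_utf8` is hypothetical.

```php
<?php
// Hypothetical helper: relies on PCRE validating the subject when the "u" modifier is set.
function is_valid_utf8($s)
{
    // preg_match() returns 1 on a match and false when the subject is not valid UTF-8.
    return preg_match('//u', $s) === 1;
}

var_dump(is_valid_utf8("\xE4\xB8\xAD\xE6\x96\x87")); // UTF-8 bytes of U+4E2D U+6587: bool(true)
var_dump(is_valid_utf8("\xD6\xD0\xCE\xC4"));         // the same characters in GBK: bool(false)
```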
With either of these functions we can detect the encoding and then convert the string to the target encoding.
- $str = isGb2312($_GET['str']) ? iconv('GBK', 'utf-8', $_GET['str']) : $_GET['str'];
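Putting detection and conversion together, a small helper might look like the sketch below. It assumes, as the text does, that the input is either UTF-8 or GBK; the name `to_utf8` is hypothetical. It checks for UTF-8 first because well-formed UTF-8 is a strict format that other encodings rarely satisfy by accident, and it uses mb_check_encoding only when mbstring is loaded.

```php
<?php
// Hypothetical helper; assumes the input is either UTF-8 or GBK.
function to_utf8($input)
{
    // Use mbstring if it is available, otherwise PCRE's "u" modifier.
    $isUtf8 = function_exists('mb_check_encoding')
        ? mb_check_encoding($input, 'UTF-8')
        : preg_match('//u', $input) === 1;
    if ($isUtf8) {
        return $input; // already UTF-8 (plain ASCII also lands here)
    }
    // Otherwise treat the bytes as GBK. This is an assumption, not a detection.
    $converted = @iconv('GBK', 'UTF-8', $input);
    // iconv() returns false on failure; fall back to the original bytes.
    return $converted === false ? $input : $converted;
}

echo bin2hex(to_utf8("\xD6\xD0\xCE\xC4")), "\n";         // GBK input: e4b8ade69687
echo bin2hex(to_utf8("\xE4\xB8\xAD\xE6\x96\x87")), "\n"; // UTF-8 input is unchanged: e4b8ade69687
```

In a request handler this would be called as `to_utf8($_GET['str'])` after checking that the parameter exists.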