// Automatically convert the character set; arrays are converted recursively.
function auto_charset($fContents, $from = 'gbk', $to = 'utf-8') {
    // Normalize the common "utf8" spelling to "utf-8".
    $from = strtoupper($from) == 'UTF8' ? 'utf-8' : $from;
    $to = strtoupper($to) == 'UTF8' ? 'utf-8' : $to;
    if (strtoupper($from) === strtoupper($to) || empty($fContents) || (is_scalar($fContents) && !is_string($fContents))) {
        // Same encoding, empty value, or non-string scalar: no conversion needed.
        return $fContents;
    }
    if (is_string($fContents)) {
        if (function_exists('mb_convert_encoding')) {
            return mb_convert_encoding($fContents, $to, $from);
        } elseif (function_exists('iconv')) {
            return iconv($from, $to, $fContents);
        } else {
            return $fContents;
        }
    } elseif (is_array($fContents)) {
        foreach ($fContents as $key => $val) {
            // Convert both the key and the value.
            $_key = auto_charset($key, $from, $to);
            $fContents[$_key] = auto_charset($val, $from, $to);
            if ($key != $_key)
                unset($fContents[$key]);
        }
        return $fContents;
    } else {
        return $fContents;
    }
}
When we accept data submitted by unknown clients, each client may use a different encoding, but the server can only process data in one encoding. The received strings therefore have to be converted to that encoding.
You may want to use iconv directly for the conversion, but iconv requires two parameters, the input encoding and the output encoding, and at this point we do not know what encoding the received string is in. It would help if we could determine the encoding of the received string.
There are usually two solutions for such problems.
Solution 1
Require the client to state the encoding when it submits data. This means adding an extra variable that names the encoding.
$string = $_GET['charset'] == 'gbk' ? iconv('gbk', 'utf-8', $_GET['str']) : $_GET['str'];
If there is no such agreement, or we cannot control the client, this solution is of little use.
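A slightly more defensive version of the one-liner above can be sketched as follows. The helper name convert_declared and the whitelist of accepted charsets are assumptions for illustration, not part of the original code; the point is only that the client-supplied charset name is checked against a known list before it reaches iconv.

```php
<?php
// Hypothetical helper (not from the original article): convert a value to
// UTF-8 using the charset the client says it used.
function convert_declared(string $value, ?string $declared): string
{
    // Assumed whitelist: only charsets the server has agreed to accept.
    $allowed = ['gbk' => 'GBK', 'gb2312' => 'GB2312', 'big5' => 'BIG5'];

    $key = strtolower(trim((string) $declared));
    if (!isset($allowed[$key])) {
        // Nothing declared, or an unknown name: leave the value untouched.
        return $value;
    }

    // iconv() returns false when the bytes are not valid in the source charset.
    $converted = @iconv($allowed[$key], 'UTF-8', $value);
    return $converted === false ? $value : $converted;
}

// Usage with request data:
// $string = convert_declared($_GET['str'] ?? '', $_GET['charset'] ?? null);
```

Falling back to the untouched value when the declared charset is unknown mirrors the original one-liner, which also passes the string through unchanged in that case.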
Solution 2
The server directly detects the received data encoding.
This is of course the ideal solution. The question is how to detect the encoding of a string. In PHP, mb_check_encoding in the mbstring extension provides this.
$str = mb_check_encoding($_GET['str'], 'gbk') ? iconv('gbk', 'utf-8', $_GET['str']) : $_GET['str'];
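One caveat with testing for GBK first: some byte sequences that are valid UTF-8 are also valid GBK, so checking for UTF-8 first tends to be the safer order. A minimal sketch, assuming mbstring is loaded and that clients send either UTF-8 or GBK (the function name to_utf8_guess is hypothetical):

```php
<?php
// Hypothetical helper: return $value as UTF-8, guessing between UTF-8 and GBK.
function to_utf8_guess(string $value): string
{
    // Valid UTF-8 (which includes plain ASCII) is returned unchanged.
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }
    // Not UTF-8: if the bytes form valid GBK, convert them.
    if (mb_check_encoding($value, 'GBK')) {
        return mb_convert_encoding($value, 'UTF-8', 'GBK');
    }
    // Neither encoding matched; hand the bytes back untouched.
    return $value;
}

// Usage with request data:
// $str = to_utf8_guess($_GET['str'] ?? '');
```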
However, this requires the mbstring extension, which may not be enabled on the production server. In that case, you need to use the following functions to determine the encoding.
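The following functions were not written by me:
function isGb2312($string) {
    for ($i = 0; $i < strlen($string); $i++) {
        $v = ord($string[$i]);
        if ($v > 127) {
            // 228..233 (0xE4..0xE9) are the UTF-8 lead bytes of common CJK characters.
            if (($v >= 228) && ($v <= 233))
            {
                // Fewer than two bytes follow, so this cannot be a 3-byte UTF-8 sequence.
                if (($i + 2) >= strlen($string)) return true;
                $v1 = ord($string[$i + 1]);
                $v2 = ord($string[$i + 2]);
                // Two continuation bytes (0x80..0xBF) follow: treat as UTF-8.
                if (($v1 >= 128) && ($v1 <= 191) && ($v2 >= 128) && ($v2 <= 191))
                    return false;
                else
                    return true;
            }
        }
    }
    return true;
}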
function isUtf8($string) {
    return preg_match('%^(?:
          [\x09\x0A\x0D\x20-\x7E]            # ASCII
        | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
        | \xE0[\xA0-\xBF][\x80-\xBF]         # excluding overlongs
        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
        | \xED[\x80-\x9F][\x80-\xBF]         # excluding surrogates
        | \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
        | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
        | \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
    )*$%xs', $string);
}
Either of the functions above can be used to detect the encoding, and the string can then be converted to the target encoding.
$str = isGb2312($_GET['str']) ? iconv('gbk', 'utf-8', $_GET['str']) : $_GET['str'];
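Where mbstring is unavailable, another option is to let PCRE do the UTF-8 validation: with the u modifier, preg_match returns false for a subject that is not valid UTF-8. The sketch below relies on that behaviour and on iconv being available; the names is_valid_utf8 and request_to_utf8, and the choice of GBK as the fallback source encoding, are assumptions for illustration.

```php
<?php
// Hypothetical helper: true when $value is well-formed UTF-8.
function is_valid_utf8(string $value): bool
{
    // With /u, PCRE validates the subject first; on malformed input
    // preg_match() returns false instead of 0 or 1.
    return @preg_match('//u', $value) === 1;
}

// Hypothetical helper: keep UTF-8 as is, otherwise try the fallback charset.
function request_to_utf8(string $value, string $fallback = 'GBK'): string
{
    if (is_valid_utf8($value)) {
        return $value;
    }
    // iconv() returns false if the bytes are not valid in $fallback either.
    $converted = @iconv($fallback, 'UTF-8', $value);
    return $converted === false ? $value : $converted;
}

// Usage with request data:
// $str = request_to_utf8($_GET['str'] ?? '');
```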