    // Automatically convert the character set; arrays are converted recursively.
    function auto_charset($fContents, $from = 'gbk', $to = 'utf-8') {
        // Normalize the alias "utf8" to "utf-8".
        $from = strtoupper($from) == 'UTF8' ? 'utf-8' : $from;
        $to   = strtoupper($to) == 'UTF8' ? 'utf-8' : $to;
        if (strtoupper($from) === strtoupper($to) || empty($fContents) || (is_scalar($fContents) && !is_string($fContents))) {
            // Same encoding, empty value, or non-string scalar: nothing to convert.
            return $fContents;
        }
        if (is_string($fContents)) {
            if (function_exists('mb_convert_encoding')) {
                return mb_convert_encoding($fContents, $to, $from);
            } elseif (function_exists('iconv')) {
                return iconv($from, $to, $fContents);
            } else {
                return $fContents;
            }
        } elseif (is_array($fContents)) {
            foreach ($fContents as $key => $val) {
                // Convert both the key and the value.
                $_key = auto_charset($key, $from, $to);
                $fContents[$_key] = auto_charset($val, $from, $to);
                // Drop the old entry if conversion changed the key.
                if ($key != $_key)
                    unset($fContents[$key]);
            }
            return $fContents;
        } else {
            return $fContents;
        }
    }
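To make the string branch above concrete, here is a minimal sketch of a round trip between UTF-8 and GBK. The helper name convert_string is hypothetical and exists only for this example; it assumes that either the mbstring or the iconv extension is available and that it recognizes the encoding names 'gbk' and 'utf-8'.

```php
<?php
// Hypothetical helper for this example only; it mirrors the string branch
// of the function above (mbstring first, then iconv, else no conversion).
function convert_string($text, $from, $to) {
    if (function_exists('mb_convert_encoding')) {
        return mb_convert_encoding($text, $to, $from);
    }
    if (function_exists('iconv')) {
        return iconv($from, $to, $text);
    }
    return $text; // no converter available: return the input unchanged
}

// Two Chinese characters written as explicit UTF-8 bytes, so the example
// does not depend on the encoding this file is saved in.
$utf8 = "\xE4\xB8\xAD\xE6\x96\x87";

$gbk  = convert_string($utf8, 'utf-8', 'gbk');
$back = convert_string($gbk, 'gbk', 'utf-8');

echo bin2hex($utf8), "\n"; // UTF-8 uses three bytes per character here
echo bin2hex($gbk), "\n";  // GBK uses two bytes per character
var_dump($back === $utf8); // the round trip should restore the original bytes
```

The same text has a different byte length in each encoding, which is why passing the wrong source encoding to the converter produces garbage rather than an error.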
At this point you may be tempted to call iconv directly. However, iconv needs both the input encoding and the output encoding, and you do not yet know which encoding the received string uses, so the first step is to find that out. There are two ways to do this.

Solution 1: have the client declare the encoding of the data it submits. This requires an extra request variable that names the encoding:

    $string = $_GET['charset'] == 'gbk' ? iconv('gbk', 'utf-8', $_GET['str']) : $_GET['str'];

If there is no such agreement, or the client is outside our control, this solution is of little use.

Solution 2: have the server detect the encoding of the received data itself. This is the preferable approach, and the question becomes how to detect the encoding of a string. In PHP, mb_check_encoding from the mbstring extension provides this:

    $str = mb_check_encoding($_GET['str'], 'gbk') ? iconv('gbk', 'utf-8', $_GET['str']) : $_GET['str'];

This requires the mbstring extension, which is not always enabled on production servers. When it is unavailable, the following functions can be used to determine the encoding instead.
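    // Heuristic check: returns false when the first byte in the range 0xE4-0xE9
    // is followed by two UTF-8 continuation bytes (the text looks like UTF-8),
    // and true otherwise.
    function isGb2312($string) {
        $len = strlen($string);
        for ($i = 0; $i < $len; $i++) {
            $v = ord($string[$i]);
            if ($v > 127) {
                if ($v >= 228 && $v <= 233) {
                    // Not enough bytes left for a 3-byte UTF-8 sequence.
                    if ($i + 2 >= $len) return true;
                    $v1 = ord($string[$i + 1]);
                    $v2 = ord($string[$i + 2]);
                    if ($v1 >= 128 && $v1 <= 191 && $v2 >= 128 && $v2 <= 191)
                        return false;
                    else
                        return true;
                }
            }
        }
        return true;
    }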
    function isUtf8($string) {
        return preg_match('%^(?:
              [\x09\x0A\x0D\x20-\x7E]            # ASCII
            | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
            |  \xE0[\xA0-\xBF][\x80-\xBF]        # excluding overlongs
            | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
            |  \xED[\x80-\x9F][\x80-\xBF]        # excluding surrogates
            |  \xF0[\x90-\xBF][\x80-\xBF]{2}     # planes 1-3
            | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
            |  \xF4[\x80-\x8F][\x80-\xBF]{2}     # plane 16
        )*$%xs', $string);
    }
Either of the functions above can be used to detect the encoding and then convert the input to the target encoding:

    $str = isGb2312($_GET['str']) ? iconv('gbk', 'utf-8', $_GET['str']) : $_GET['str'];
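One possible way to package detection and conversion together is sketched below. It tests for UTF-8 first, because UTF-8 has a strict byte structure that GBK text rarely satisfies by accident, whereas many UTF-8 byte sequences are also valid GBK. The helper name to_utf8 is hypothetical, and treating everything that is not valid UTF-8 as GBK is an assumption of this sketch, not something the bytes themselves can prove. The validity check relies on the PCRE /u modifier, under which preg_match rejects a subject that is not valid UTF-8.

```php
<?php
// Hypothetical helper: returns $text as UTF-8, ASSUMING that any input which
// is not valid UTF-8 is GBK.
function to_utf8($text) {
    // With the /u modifier the empty pattern matches only when the subject is
    // valid UTF-8; on invalid input preg_match returns false instead of 1.
    if (preg_match('//u', $text) === 1) {
        return $text; // already valid UTF-8 (plain ASCII included)
    }
    if (function_exists('mb_convert_encoding')) {
        return mb_convert_encoding($text, 'UTF-8', 'GBK');
    }
    if (function_exists('iconv')) {
        return iconv('GBK', 'UTF-8', $text);
    }
    return $text; // no converter available: return the input unchanged
}

// The same two Chinese characters, as GBK bytes and as UTF-8 bytes.
$gbk_bytes  = "\xD6\xD0\xCE\xC4";
$utf8_bytes = "\xE4\xB8\xAD\xE6\x96\x87";

var_dump(to_utf8($gbk_bytes) === $utf8_bytes);       // GBK input is converted
var_dump(to_utf8($utf8_bytes) === $utf8_bytes);      // UTF-8 input is left alone
var_dump(to_utf8('plain ascii') === 'plain ascii');  // ASCII is left alone
```

This remains a heuristic: a short GBK string can happen to form a valid UTF-8 sequence, so an explicit charset parameter from the client is still the more reliable option when it is available.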