PHP string processing: full-width and half-width conversion
Converting between full-width and half-width characters is a common problem in string processing. This article outlines one approach.
I. Concepts
Full-width characters occupy Unicode code points 65281 to 65374 (hexadecimal 0xFF01 to 0xFF5E).
Half-width characters occupy Unicode code points 33 to 126 (hexadecimal 0x21 to 0x7E).
The space is a special case: the full-width space is 12288 (0x3000) and the half-width space is 32 (0x20).
Apart from the space, full-width and half-width characters appear in the same order, so every full-width code point is exactly 65248 (0xFEE0) above its half-width counterpart.
Non-space characters can therefore be converted by adding or subtracting that offset, while the space is handled separately.
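The offset arithmetic can be checked with a minimal sketch that works on code points only. The constant FULLWIDTH_OFFSET and the helpers toFullWidthCodePoint and toHalfWidthCodePoint are hypothetical names used for illustration; they are not part of the article's implementation.

```php
<?php
// Distance between a full-width character and its half-width counterpart.
const FULLWIDTH_OFFSET = 0xFF01 - 0x21; // 0xFEE0 (65248)

// Hypothetical helper: map a half-width code point to its full-width one.
function toFullWidthCodePoint(int $cp): int {
    if ($cp === 0x20) {
        return 0x3000;                  // the space is the special case
    }
    if ($cp >= 0x21 && $cp <= 0x7E) {
        return $cp + FULLWIDTH_OFFSET;  // everything else: add the offset
    }
    return $cp;                         // outside the range: unchanged
}

// Hypothetical helper: map a full-width code point to its half-width one.
function toHalfWidthCodePoint(int $cp): int {
    if ($cp === 0x3000) {
        return 0x20;
    }
    if ($cp >= 0xFF01 && $cp <= 0xFF5E) {
        return $cp - FULLWIDTH_OFFSET;
    }
    return $cp;
}

printf("0x%X\n", toFullWidthCodePoint(0x41));   // 0xFF21 (full-width A)
printf("0x%X\n", toHalfWidthCodePoint(0xFF5E)); // 0x7E   (~)
```

Both ends of the range give the same offset (0xFF5E - 0x7E is also 0xFEE0), which is why a single addition or subtraction is enough.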
II. Implementation approach
1. Find the target characters, which a regular expression can do.
2. Modify their Unicode code points.
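The two steps can be sketched as follows, assuming PHP 7 or later (for the "\u{...}" string escapes) and a PCRE build with UTF-8 support. The variable names are illustrative, and the callback only prints the matched bytes to mark the place where the code point would be modified.

```php
<?php
// Step 1: a Unicode-aware pattern (/u) that matches the full-width space
// and the full-width characters 0xFF01 to 0xFF5E.
$pattern = '/[\x{3000}\x{ff01}-\x{ff5e}]/u';

// "PHP" followed by full-width 1, full-width 2, a full-width space, and "ok".
$text = "PHP\u{FF11}\u{FF12}\u{3000}ok";

preg_match_all($pattern, $text, $matches);
echo count($matches[0]), "\n"; // 3

// Step 2: the callback receives each matched character in $m[0]; this is
// where its code point would be changed. Here it is only shown as hex bytes.
$marked = preg_replace_callback($pattern, function ($m) {
    return '[' . bin2hex($m[0]) . ']';
}, $text);
echo $marked, "\n"; // PHP[efbc91][efbc92][e38080]ok
```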
III. Implementation
1. Two helper functions convert between Unicode code points and UTF-8 characters:
    /**
     * Convert a Unicode code point to a character
     * @param int $unicode
     * @return string|false UTF-8 character, or false if out of range
     */
    function unicode2Char($unicode) {
        if ($unicode < 128)     return chr($unicode);
        if ($unicode < 2048)    return chr(($unicode >> 6) + 192) .
                                       chr(($unicode & 63) + 128);
        if ($unicode < 65536)   return chr(($unicode >> 12) + 224) .
                                       chr((($unicode >> 6) & 63) + 128) .
                                       chr(($unicode & 63) + 128);
        if ($unicode < 2097152) return chr(($unicode >> 18) + 240) .
                                       chr((($unicode >> 12) & 63) + 128) .
                                       chr((($unicode >> 6) & 63) + 128) .
                                       chr(($unicode & 63) + 128);
        return false;
    }

    /**
     * Convert a character to its Unicode code point
     * @param string $char must be a single UTF-8 character
     * @return int|false
     */
    function char2Unicode($char) {
        switch (strlen($char)) {
            case 1: return ord($char);
            case 2: return (ord($char[1]) & 63) |
                           ((ord($char[0]) & 31) << 6);
            case 3: return (ord($char[2]) & 63) |
                           ((ord($char[1]) & 63) << 6) |
                           ((ord($char[0]) & 15) << 12);
            case 4: return (ord($char[3]) & 63) |
                           ((ord($char[2]) & 63) << 6) |
                           ((ord($char[1]) & 63) << 12) |
                           ((ord($char[0]) & 7) << 18);
            default:
                trigger_error('Character is not UTF-8!', E_USER_WARNING);
                return false;
        }
    }
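As a worked example of the three-byte case these helpers rely on, the sketch below encodes the full-width letter A (U+FF21) by hand and decodes it again. It assumes PHP 7 or later for the "\u{...}" escape, which is used only to cross-check the result; the variable names are illustrative.

```php
<?php
$cp = 0xFF21; // full-width A

// Encode: 1110xxxx 10xxxxxx 10xxxxxx
$bytes = chr(($cp >> 12) + 224)           // top 4 bits    -> 0xEF
       . chr((($cp >> 6) & 63) + 128)     // middle 6 bits -> 0xBC
       . chr(($cp & 63) + 128);           // low 6 bits    -> 0xA1

echo bin2hex($bytes), "\n";               // efbca1
var_dump($bytes === "\u{FF21}");          // bool(true)

// Decode: strip the marker bits and shift each group back into place.
$decoded = ((ord($bytes[0]) & 15) << 12)
         | ((ord($bytes[1]) & 63) << 6)
         |  (ord($bytes[2]) & 63);

printf("0x%X\n", $decoded);               // 0xFF21
```

Every full-width character in the range 0xFF01 to 0xFF5E, and the full-width space 0x3000, falls into this three-byte case.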
2. Full-width to half-width
    /**
     * Full-width to half-width
     * @param string $str
     * @return string
     */
    function sbc2Dbc($str) {
        return preg_replace_callback(
            // full-width characters
            '/[\x{3000}\x{ff01}-\x{ff5e}]/u',
            // 0x3000 is the full-width space and is handled separately; any other
            // full-width code point minus 0xfee0 gives its half-width counterpart
            function ($m) {
                $unicode = char2Unicode($m[0]);
                return $unicode == 0x3000 ? ' ' : chr($unicode - 0xfee0);
            },
            $str
        );
    }
3. Half-width to full-width
    /**
     * Half-width to full-width
     * @param string $str
     * @return string
     */
    function dbc2Sbc($str) {
        return preg_replace_callback(
            // half-width characters, including the space
            '/[\x{0020}-\x{7e}]/u',
            // 0x0020 is the space and is handled separately; any other
            // half-width code point plus 0xfee0 gives its full-width counterpart
            function ($m) {
                $unicode = char2Unicode($m[0]);
                return unicode2Char($unicode == 0x0020 ? 0x3000 : $unicode + 0xfee0);
            },
            $str
        );
    }
IV. Test
Sample code:
    $a = 'abc12 345';
    $sbc = dbc2Sbc($a);
    $dbc = sbc2Dbc($sbc);

    var_dump($a, $sbc, $dbc);
Result:
    string(9) "abc12 345"
    string(27) "ａｂｃ１２　３４５"
    string(9) "abc12 345"
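The length of 27 in the second line is a byte count: each of the nine full-width characters takes three bytes in UTF-8. A small sketch, assuming PHP 7 or later for the "\u{...}" escapes and using illustrative variable names, makes the difference between bytes and characters visible:

```php
<?php
$half = 'abc12 345';
// The same text in full-width form, written with explicit code points.
$full = "\u{FF41}\u{FF42}\u{FF43}\u{FF11}\u{FF12}\u{3000}\u{FF13}\u{FF14}\u{FF15}";

echo strlen($half), "\n";                 // 9  (bytes, one per character)
echo strlen($full), "\n";                 // 27 (bytes, three per character)
echo preg_match_all('/./u', $full), "\n"; // 9  (characters)
```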