Converting between full-width and half-width characters is a common problem in string processing. This article presents one way to do it.
I. Concepts
Full-width characters occupy Unicode code points 65281 to 65374 (hex 0xFF01 to 0xFF5E).
Half-width characters occupy Unicode code points 33 to 126 (hex 0x21 to 0x7E).
The space is a special case: the full-width space is 12288 (0x3000) and the half-width space is 32 (0x20).
Apart from the space, the full-width and half-width characters appear in the same order in Unicode, so each full-width code point equals its half-width counterpart plus 0xFEE0 (65248).
Non-space characters can therefore be converted by adding or subtracting this offset, with the space handled separately.
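The offset relationship can be checked directly. The following is a minimal sketch, assuming PHP 7.2 or later with the mbstring extension (for `mb_ord` and `mb_chr`). The constant `WIDTH_OFFSET` and the helper `toFullWidth` are hypothetical names used only for illustration.

```php
<?php
// Sketch only: demonstrates the constant offset between half-width and
// full-width code points. Assumes PHP >= 7.2 with the mbstring extension.

const WIDTH_OFFSET = 0xFEE0; // 0xFF01 - 0x21 = 65248

// Hypothetical helper: map one half-width character to its full-width form.
function toFullWidth(string $char): string
{
    $code = mb_ord($char, 'UTF-8');
    if ($code === 0x20) {
        // The space does not follow the offset rule.
        return mb_chr(0x3000, 'UTF-8');
    }
    if ($code >= 0x21 && $code <= 0x7E) {
        return mb_chr($code + WIDTH_OFFSET, 'UTF-8');
    }
    // Anything outside the half-width range is returned unchanged.
    return $char;
}

// Difference between full-width "Ａ" (U+FF21) and half-width "A" (U+0041).
printf("0x%X\n", mb_ord("\u{FF21}", 'UTF-8') - mb_ord('A', 'UTF-8')); // 0xFEE0
echo toFullWidth('A'), "\n";
```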
II. Approach
1. Find the characters in the target Unicode range; a regular expression can do this.
2. Modify their Unicode code points.
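The two steps can be sketched as follows. This assumes PHP 7.2 or later with mbstring and PCRE's `u` modifier. `listFullWidth` is a hypothetical helper: it only locates the full-width characters and reports the code point each one would map to, without modifying the string.

```php
<?php
// Hypothetical helper. Step 1 finds full-width characters with a regular
// expression; step 2 computes the half-width code point for each match.
function listFullWidth(string $str): array
{
    preg_match_all('/[\x{3000}\x{FF01}-\x{FF5E}]/u', $str, $matches);

    $result = [];
    foreach ($matches[0] as $char) {
        $code = mb_ord($char, 'UTF-8');
        // The full-width space maps to 0x20; everything else shifts by 0xFEE0.
        $result[$char] = ($code === 0x3000) ? 0x20 : $code - 0xFEE0;
    }
    return $result; // repeated characters appear once, keyed by the character
}

// Input: full-width "ＰＨＰ", a full-width space, full-width "７", then ASCII text.
$sample = "\u{FF30}\u{FF28}\u{FF30}\u{3000}\u{FF17} rocks";
foreach (listFullWidth($sample) as $char => $target) {
    printf("%s -> 0x%02X\n", $char, $target);
}
```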
III. Implementation
1. First, two functions that convert between Unicode code points and UTF-8 characters:
/**
 * Convert a Unicode code point to a character
 * @param int $unicode
 * @return string UTF-8 character
 **/
function unicode2char($unicode){
    // 1 byte: 0xxxxxxx
    if($unicode < 128)     return chr($unicode);
    // 2 bytes: 110xxxxx 10xxxxxx
    if($unicode < 2048)    return chr(($unicode >> 6) + 192) .
                                  chr(($unicode & 63) + 128);
    // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    if($unicode < 65536)   return chr(($unicode >> 12) + 224) .
                                  chr((($unicode >> 6) & 63) + 128) .
                                  chr(($unicode & 63) + 128);
    // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    if($unicode < 2097152) return chr(($unicode >> 18) + 240) .
                                  chr((($unicode >> 12) & 63) + 128) .
                                  chr((($unicode >> 6) & 63) + 128) .
                                  chr(($unicode & 63) + 128);
    return false;
}

/**
 * Convert a character to its Unicode code point
 * @param string $char must be a single UTF-8 character
 * @return int
 **/
function char2unicode($char){
    // Square brackets are used for byte access; the $char{0} form no longer works in PHP 8.
    switch(strlen($char)){
        case 1: return ord($char);
        case 2: return (ord($char[1]) & 63) |
                       ((ord($char[0]) & 31) << 6);
        case 3: return (ord($char[2]) & 63) |
                       ((ord($char[1]) & 63) << 6) |
                       ((ord($char[0]) & 15) << 12);
        case 4: return (ord($char[3]) & 63) |
                       ((ord($char[2]) & 63) << 6) |
                       ((ord($char[1]) & 63) << 12) |
                       ((ord($char[0]) & 7) << 18);
        default:
            trigger_error('Character is not UTF-8!', E_USER_WARNING);
            return false;
    }
}
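To see what the bit arithmetic in the three-byte branch does, here is a worked example for U+FF21 (full-width "Ａ"). It is a sketch only: `encode3` is a hypothetical name, and it covers just the code points from 0x0800 to 0xFFFF, which is where every full-width character discussed here lives.

```php
<?php
// Hypothetical helper: UTF-8 encode a code point in the range 0x0800-0xFFFF.
function encode3(int $cp): string
{
    return chr(0xE0 | ($cp >> 12))          // 1110xxxx: top 4 bits
         . chr(0x80 | (($cp >> 6) & 0x3F))  // 10xxxxxx: middle 6 bits
         . chr(0x80 | ($cp & 0x3F));        // 10xxxxxx: low 6 bits
}

// 0xFF21 = 1111 111100 100001 in binary, split as 4 + 6 + 6 bits.
echo bin2hex(encode3(0xFF21)), "\n"; // efbca1
```

Because every byte of a three-byte sequence has its high bits fixed, `0xE0 | x` here gives the same result as the `x + 224` form used in the function above.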
2. Full-width to half-width
/**
 * Convert full-width to half-width
 * @param string $str
 * @return string
 **/
function SBC2DBC($str){
    return preg_replace_callback(
        // full-width characters
        '/[\x{3000}\x{ff01}-\x{ff5e}]/u',
        // Code point conversion:
        // 0x3000 is the space and is handled specially; any other full-width
        // code point minus 0xfee0 gives the half-width character
        function ($m) {
            $unicode = char2unicode($m[0]);
            return $unicode == 0x3000 ? ' ' : unicode2char($unicode - 0xfee0);
        },
        $str
    );
}
3. Half-width to full-width
/**
 * Convert half-width to full-width
 * @param string $str
 * @return string
 **/
function DBC2SBC($str){
    return preg_replace_callback(
        // half-width characters, including the space
        '/[\x{0020}-\x{007e}]/u',
        // Code point conversion:
        // 0x0020 is the space and is handled specially; any other half-width
        // code point plus 0xfee0 gives the full-width character
        function ($m) {
            $unicode = char2unicode($m[0]);
            return unicode2char($unicode == 0x0020 ? 0x3000 : $unicode + 0xfee0);
        },
        $str
    );
}
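The same mapping can also be expressed as a lookup table instead of a regular expression. This is a table-driven variant, not the method used in this article, and is shown only as a sketch. It assumes PHP 7.2 or later with mbstring; `widthTable` and `toHalfWidth` are hypothetical names.

```php
<?php
// Hypothetical helper: build the full-width => half-width map once.
function widthTable(): array
{
    // The space is the one pair that does not follow the 0xFEE0 offset.
    $table = [mb_chr(0x3000, 'UTF-8') => ' '];
    for ($code = 0x21; $code <= 0x7E; $code++) {
        $table[mb_chr($code + 0xFEE0, 'UTF-8')] = chr($code);
    }
    return $table;
}

// Hypothetical helper: strtr() replaces every key found in the string.
function toHalfWidth(string $str): string
{
    static $table = null;
    if ($table === null) {
        $table = widthTable();
    }
    return strtr($str, $table);
}

// Input: full-width "ＡＢＣ", a full-width space, then full-width "１２３".
echo toHalfWidth("\u{FF21}\u{FF22}\u{FF23}\u{3000}\u{FF11}\u{FF12}\u{FF13}"), "\n";
```

The table holds 95 entries and is built only once per request, which may suit code that converts many strings.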
IV. Testing
Example code:
$a = 'abc12 345';
$sbc = DBC2SBC($a);
$dbc = SBC2DBC($sbc);

var_dump($a, $sbc, $dbc);
Results:
string(9) "abc12 345"
string(27) "ａｂｃ１２　３４５"
string(9) "abc12 345"