Method one: converting Chinese characters to Unicode
The code is as follows:
<?php //convert UTF8 encoded kanji to Unicode function Htou ($c) { $n = (ord ($c [0]) & 0x1f) << 12; $n = (Ord ($c [1]) & 0x3f) << 6; $n = Ord ($c [2]) & 0x3f; return $n; } //Hide UTF8 format string in code function My_utf8_unicode ($str) { $encode = '; for ($i =0; $i < Strlen ($STR); $i) { if (Ord (substr ($str, $i, 1)) > 0xa0) { $encode. = ' &# '. Htou (substr ($ STR, $i, 3)). '; '; $i = 2; }else{ $encode. = ' &# '. Ord ($str [$i]). } } return $encode; } Echo my_utf8_unicode (haha abc); |
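The masking and shifting in Htou is easier to follow on a single character. The sketch below works through the three UTF-8 bytes of 汉 (U+6C49); the variable names are illustrative and not part of the original code.

```php
<?php
// The three UTF-8 bytes of "汉" (U+6C49): 0xE6 0xB1 0x89
$c = "\xE6\xB1\x89";

$high = (ord($c[0]) & 0x0F) << 12; // 0x6000: low 4 bits of the lead byte
$mid  = (ord($c[1]) & 0x3F) << 6;  // 0x0C40: low 6 bits of the second byte
$low  =  ord($c[2]) & 0x3F;        // 0x0009: low 6 bits of the third byte

// OR the three parts together to get the code point
$codePoint = $high | $mid | $low;

printf("U+%04X = %d\n", $codePoint, $codePoint); // prints U+6C49 = 27721
```

Because the three parts occupy disjoint bit ranges, combining them with `|` or with `+` gives the same result.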
Method two: converting Chinese characters to Unicode
The code is as follows:
function Getunicode($word) {
    // Convert to UTF-8: if the input survives a GBK round trip unchanged, treat it as GBK
    $word0 = iconv('GBK', 'UTF-8', $word);
    $word1 = iconv('UTF-8', 'GBK', $word0);
    $word = ($word1 == $word) ? $word0 : $word;

    // Split into characters (one ASCII byte or one multi-byte sequence each)
    preg_match_all('#(?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+)#s', $word, $array, PREG_PATTERN_ORDER);

    $return = array();
    // Conversion
    foreach ($array[0] as $cc) {
        $arr = str_split($cc);
        $bin_str = '';
        foreach ($arr as $value) {
            $bin_str .= decbin(ord($value));
        }
        // For a 3-byte character (24 bits), drop the UTF-8 marker bits:
        // 1110xxxx 10xxxxxx 10xxxxxx -> xxxx xxxxxx xxxxxx
        $bin_str = preg_replace('/^.{4}(.{4}).{2}(.{6}).{2}(.{6})$/', '$1$2$3', $bin_str);
        $return[] = '&#' . bindec($bin_str) . ';';
    }
    return implode('', $return);
}
Function usage:
The code is as follows:
$word = ' Converts a Chinese character into a Unicode four-byte encoded PHP function. '; echo Getunicode ($word); |
The code above produces the following output:
&#19968;&#20010;&#27721;&#23383;&#36716;&#25442;&#25104;&#65333;&#65358;
&#65353;&#65347;&#65359;&#65348;&#65349;&#22235;&#23383;&#33410;&#32534;
&#30721;&#30340;&#80;&#72;&#80;&#20989;&#25968;&#12290;
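One way to check output like this is to decode the numeric entities back into text with PHP's built-in html_entity_decode. The sketch below decodes the first four entities of the output; it assumes the script file itself is saved as UTF-8.

```php
<?php
// The first four entities from the output
$entities = '&#19968;&#20010;&#27721;&#23383;';

// Decode the numeric character references into a UTF-8 string
$text = html_entity_decode($entities, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $text, "\n"; // prints 一个汉字
```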
The following pair of functions converts Chinese characters into Unicode numeric entities and decodes those entities back into Chinese characters.
Function to convert Chinese characters to Unicode:
The code is as follows:
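function Uni_encode($word) {
    // Convert to UTF-8 if the input survives a GBK round trip unchanged
    $word0 = iconv('GBK', 'UTF-8', $word);
    $word1 = iconv('UTF-8', 'GBK', $word0);
    $word = ($word1 == $word) ? $word0 : $word;
    // json_encode escapes non-ASCII characters as \uXXXX
    $word = json_encode($word);
    // Strip the surrounding quotes, then turn each \uXXXX escape into a decimal entity.
    // A closure is used because create_function is not available in PHP 8.
    $word = preg_replace_callback('/\\\\u([0-9a-f]{4})/', function ($hex) {
        return '&#' . hexdec($hex[1]) . ';';
    }, substr($word, 1, strlen($word) - 2));
    return $word;
}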
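To see what the callback receives, it can help to trace one character through the same steps by hand. The sketch below is illustrative only: the variable names are made up, and the script file is assumed to be saved as UTF-8.

```php
<?php
// json_encode wraps the string in quotes and escapes non-ASCII characters
$json = json_encode('汉');    // "\u6c49" (8 characters, quotes included)

// Skip the opening quote, the backslash and the "u"; keep the 4 hex digits
$hex = substr($json, 3, 4);   // 6c49

// Convert the hex digits to the decimal code point used in the entity
$dec = hexdec($hex);          // 27721

echo $json, ' -> ', $hex, ' -> &#', $dec, ";\n";
```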
Function to decode the Unicode encoding:
The code is as follows:
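function Uni_decode($uncode) {
    // Turn each decimal entity into a \uXXXX escape (zero-padded to 4 hex digits),
    // wrap the result in quotes, and let json_decode build the string.
    // A closure is used because create_function is not available in PHP 8.
    $word = json_decode(preg_replace_callback('/&#(\d+);/', function ($dec) {
        return sprintf('\u%04x', $dec[1]);
    }, '"' . $uncode . '"'));
    return $word;
}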
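The decoding direction can be traced the same way for a single code point. This is a minimal sketch with illustrative variable names, using the decimal value 27721 (the character 汉).

```php
<?php
// Build the JSON escape for code point 27721, padded to four hex digits
$escape = sprintf('\u%04x', 27721);        // \u6c49

// Wrap it in quotes so it is a valid JSON string, then decode it
$char = json_decode('"' . $escape . '"');  // 汉 as a UTF-8 string

echo $escape, ' -> ', $char, "\n";
```

A JSON `\u` escape carries exactly four hex digits, so this direct mapping only covers code points up to U+FFFF; characters beyond that are written in JSON as two surrogate escapes.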