The principle is simple. In gb2312/GBK a Chinese character occupies two bytes, and each of the two bytes has a defined value range; in UTF-8 a Chinese character usually occupies three bytes, and each of those bytes also has a defined value range. English (ASCII) characters have values below 128 in every one of these encodings and occupy a single byte (full-width forms excluded).
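The byte ranges described above can be turned into a rough encoding check. The sketch below is only a heuristic under stated assumptions: the function name `guess_encoding` is hypothetical, the GBK ranges used (lead byte 0x81-0xFE, trail byte 0x40-0x7E or 0x80-0xFE) are the commonly documented ones, and some short byte sequences are valid in both encodings, in which case UTF-8 wins here.

```php
<?php
// Hypothetical helper: guess whether a byte string is ASCII, UTF-8 or GBK
// by looking only at byte value ranges. This is a heuristic, not a detector.
function guess_encoding(string $bytes): string
{
    // Every byte below 128: plain ASCII, identical in all three encodings.
    if (!preg_match('/[\x80-\xFF]/', $bytes)) {
        return 'ASCII';
    }
    // With the /u modifier PCRE rejects subjects that are not valid UTF-8,
    // so preg_match() returns false (not 1) for malformed input.
    if (preg_match('//u', $bytes) === 1) {
        return 'UTF-8';
    }
    // GBK: single ASCII bytes, or a lead byte 0x81-0xFE followed by a
    // trail byte 0x40-0x7E or 0x80-0xFE.
    if (preg_match('/\A(?:[\x00-\x7F]|[\x81-\xFE][\x40-\x7E\x80-\xFE])*\z/', $bytes) === 1) {
        return 'GBK';
    }
    return 'unknown';
}

echo guess_encoding("abc"), "\n";
echo guess_encoding("\xE4\xB8\xAD\xE6\x96\x87"), "\n"; // UTF-8 bytes
echo guess_encoding("\xD6\xD0\xCE\xC4"), "\n";         // GBK bytes
```

A check like this can narrow down the candidates, but it cannot replace knowing the encoding from the page header or the data source.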
When PHP processes a page, we use iconv or the mb_convert functions (such as mb_convert_encoding) to convert between character sets. However, this has a prerequisite: we must know the input and output encodings in advance before we c
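As a minimal sketch of such a conversion, assuming the input is known to be GBK and the desired output is UTF-8 (the wrapper name `to_utf8` is hypothetical, and the encoding names accepted depend on the iconv implementation PHP was built against):

```php
<?php
// Hypothetical wrapper around iconv(): the source encoding must be supplied
// by the caller, because iconv() cannot discover it on its own.
function to_utf8(string $bytes, string $fromEncoding): string
{
    // iconv(from, to, string) returns false when the conversion fails;
    // @ suppresses the accompanying notice or warning.
    $converted = @iconv($fromEncoding, 'UTF-8', $bytes);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert input from $fromEncoding");
    }
    return $converted;
}

$gbk  = "\xD6\xD0\xCE\xC4";   // two Chinese characters encoded as GBK
$utf8 = to_utf8($gbk, 'GBK');
echo strlen($gbk), ' bytes in GBK, ', strlen($utf8), " bytes in UTF-8\n";
```

Note that `iconv()` takes the source encoding first, while `mb_convert_encoding()` takes the target encoding first.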
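Encoding conversion between gb2312 and Unicode
The following example converts the gb2312 character "whole" (全) into a numeric character reference of the form &#NNNNN;.
Since PHP 4.3.1 the iconv function is very convenient for this; you only need to write a UTF-8 to Unicode conversion function yourself.
Looking the value up in a table (gb2312.txt) also works.
The code is as follows:
<?
$text = "cloud-dwelling community"; // sample text (presumably a gb2312-encoded Chinese string in the original)
preg_match_all("/[\x80-\xff]?./", $text, $ar); // one match per character: an optional high byte plus one more byte
foreach ($ar[0] as $v)
echo "&#" . utf8_unicode(iconv("GB2312", "UTF-8", $v)) . ";";
?>
<?
// UTF-8 -> Unicode
function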
Encoding conversion between gb2312 and Unicode
The following example converts the gb2312 character "whole" (全) into a numeric character reference of the form &#NNNNN;.
Since PHP 4.3.1 the iconv function is very convenient for this; you only need to write a UTF-8 to Unicode conversion function yourself. Looking the value up in a table (gb2312.txt) also works.
<?
$text = "electronic stacks";
preg_match_all("/[\x80-\xff]?./", $text, $ar);
foreach ($ar[0] as $v)
echo "&#" . utf8_unicode(iconv("GB2312", "UTF-8", $v)) . ";";
?>
<?
// UTF-8 -> Unicode
function utf8_unicode($c) {
switch (strlen($c)) {
case 1:
ret
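The excerpt breaks off inside the conversion function. As a hedged sketch of what a UTF-8 to code point helper of this kind could look like (this is not the original author's code; the names `utf8_to_codepoint` and `gb2312_to_entities` are hypothetical, and the input is assumed to be well-formed):

```php
<?php
// Hypothetical helper: decode ONE well-formed UTF-8 character (1 to 4 bytes)
// into its Unicode code point by stripping the marker bits of each byte.
function utf8_to_codepoint(string $c): int
{
    switch (strlen($c)) {
        case 1: // 0xxxxxxx
            return ord($c);
        case 2: // 110xxxxx 10xxxxxx
            return ((ord($c[0]) & 0x1F) << 6) | (ord($c[1]) & 0x3F);
        case 3: // 1110xxxx 10xxxxxx 10xxxxxx
            return ((ord($c[0]) & 0x0F) << 12)
                 | ((ord($c[1]) & 0x3F) << 6)
                 |  (ord($c[2]) & 0x3F);
        case 4: // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            return ((ord($c[0]) & 0x07) << 18)
                 | ((ord($c[1]) & 0x3F) << 12)
                 | ((ord($c[2]) & 0x3F) << 6)
                 |  (ord($c[3]) & 0x3F);
    }
    throw new InvalidArgumentException('Expected one UTF-8 character');
}

// Hypothetical helper: turn a gb2312 byte string into decimal numeric
// character references, one per character (ASCII included).
function gb2312_to_entities(string $gb): string
{
    // Same splitting idea as the excerpt: optional high byte + one byte.
    preg_match_all('/[\x80-\xff]?./s', $gb, $matches);
    $out = '';
    foreach ($matches[0] as $char) {
        $out .= '&#' . utf8_to_codepoint(iconv('GB2312', 'UTF-8', $char)) . ';';
    }
    return $out;
}

echo gb2312_to_entities("\xC8\xAB"), "\n"; // "\xC8\xAB" is one gb2312 character
```

Writing the sample bytes as hex escapes keeps the example independent of the encoding the source file itself is saved in.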
About the BIG5-HKSCS solution. After being troubled by this problem for a long time, it was frustrating to discover that PHP has supported HKSCS all along; the encoding is simply named BIG5-HKSCS rather than HK-SCS.
The following is a solution for the HK (Hong Kong) supplementary character set:
The HTML fac
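As a minimal sketch of what using that encoding name looks like, assuming the iconv library behind PHP recognizes `BIG5-HKSCS` (availability depends on the platform, so the hypothetical helper `hkscs_to_utf8` returns null instead of failing when it does not):

```php
<?php
// Hypothetical helper: convert BIG5-HKSCS bytes to UTF-8 with iconv().
// Returns null when the conversion fails, for example because the local
// iconv library does not know the encoding name.
function hkscs_to_utf8(string $bytes): ?string
{
    $converted = @iconv('BIG5-HKSCS', 'UTF-8', $bytes);
    return $converted === false ? null : $converted;
}

$result = hkscs_to_utf8("\xA4\xA4"); // a character from the plain Big5 range
echo $result === null ? "BIG5-HKSCS not available\n" : bin2hex($result) . "\n";
```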
Use PHP to convert between gb2312 and Unicode.
The following example converts the gb2312 character "full" (全) into a numeric character reference of the form &#NNNNN;.
The iconv function available since PHP 4.3.1 is very useful; you only need to write a conversion function from UTF-8 to Unicode yourself.
Looking the value up in a table (gb2312.txt) also works.
The code is as follows:
$text = ""; preg_match_all("
I. Analysis of the principle of converting Chinese characters to decimal
In GBK encoding a Chinese character consists of two bytes. The byte values of the Chinese characters in a string can be obtained as follows:
The code is as follows:
$string = "Do not be infatuated with Brother"; // sample text (presumably a GBK-encoded Chinese string in the original)
$length = strlen($string);
for ($i = 0; $i < $length; $i++) {
if (ord($string[$i]) > 127) {
$result[] = ord($string[$i]) . ' '
* English, e.g. I i i j => wiwj
* e.g.
* $py = new str2py();
* $result = $py->getinitials('Ah, just the hungry fly just fine I saw you oh flat to people is he UV I want to one in');
*/
class str2py
{
private $_pinyins = array(
176161 => 'A', 176197 => 'B', 178193 => 'C', 180238 => 'D',
182234 => 'E', 183162 => 'F', 184193 => 'G', 185254 => 'H',
187247 => 'J', 191166 => 'K', 192172 => 'L', 194232 => 'M',
196195 => 'N', 197182 => 'O', 197190 => 'P', 198218 => 'Q',
200187 => 'R', 200246 => 'S', 203250 => '
An analysis of the principle of converting Chinese characters to decimal
In GBK encoding a Chinese character consists of two bytes. The byte values of the Chinese characters in a string can be obtained as follows:
The code is as follows:
$string = "Don't be infatuated with elder brother"; // sample text (presumably a GBK-encoded Chinese string in the original)
$length = strlen($string);
for ($i = 0; $i < $length; $i++) {
if (ord($string[$i]) > 127) { // a byte above 127 starts a two-byte character
$result[] = ord($string[$i]) . ' ' .
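The loop in the excerpt is cut off. A self-contained sketch of the same idea, assuming the intent is to record both byte values of every two-byte GBK character (the function name `gbk_byte_pairs` is hypothetical):

```php
<?php
// Hypothetical helper: list the decimal byte values of each two-byte
// GBK character in a string, skipping single-byte ASCII characters.
function gbk_byte_pairs(string $gbk): array
{
    $result = [];
    $length = strlen($gbk);
    for ($i = 0; $i < $length; $i++) {
        // A byte above 127 starts a two-byte character in GBK.
        if (ord($gbk[$i]) > 127 && $i + 1 < $length) {
            $result[] = ord($gbk[$i]) . ' ' . ord($gbk[$i + 1]);
            $i++; // skip the trail byte that was just consumed
        }
    }
    return $result;
}

print_r(gbk_byte_pairs("a\xD6\xD0b\xCE\xC4"));
```

Advancing the index past the trail byte matters: GBK trail bytes can themselves be above 127, so without the skip they would be mistaken for the start of another character.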
=> 'A', 176197 => 'B', 178193 => 'C', 180238 => 'D',
182234 => 'E', 183162 => 'F', 184193 => 'G', 185254 => 'H',
187247 => 'J', 191166 => 'K', 192172 => 'L', 194232 => 'M',
196195 => 'N', 197182 => 'O', 197190 => 'P', 198218 => 'Q',
200187 => 'R', 200246 => 'S', 203250 => 'T', 205218 => 'W',
206244 => 'X', 209185 => 'Y', 212209 => 'Z',
);
private $_charset = null;
/**
* Constructor, specifying the required encoding. Default: utf-8
* Supports utf-8, gb2312
*
* @param unknown_type $
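The keys in the table appear to be the two gb2312 byte values of the first character for each initial, combined as first byte × 1000 + second byte (176161 would be bytes 176 and 161). That reading is an inference from the numbers, not something the excerpt states. Under that assumption, a lookup could be sketched as follows; `gb2312_initial` is a hypothetical function, not a method of the class in the excerpt, and it only covers the pinyin-ordered level-1 range, whose last code is assumed to be bytes 215 and 249.

```php
<?php
// Hypothetical lookup: map one two-byte gb2312 character to the first
// letter of its pinyin, using range boundaries like those in the excerpt.
function gb2312_initial(string $gbChar): ?string
{
    // Each key is assumed to be byte1 * 1000 + byte2 of the first
    // character whose pinyin starts with the given letter.
    static $boundaries = [
        176161 => 'A', 176197 => 'B', 178193 => 'C', 180238 => 'D',
        182234 => 'E', 183162 => 'F', 184193 => 'G', 185254 => 'H',
        187247 => 'J', 191166 => 'K', 192172 => 'L', 194232 => 'M',
        196195 => 'N', 197182 => 'O', 197190 => 'P', 198218 => 'Q',
        200187 => 'R', 200246 => 'S', 203250 => 'T', 205218 => 'W',
        206244 => 'X', 209185 => 'Y', 212209 => 'Z',
    ];
    if (strlen($gbChar) !== 2) {
        return null; // not a two-byte character
    }
    $key = ord($gbChar[0]) * 1000 + ord($gbChar[1]);
    if ($key < 176161 || $key > 215249) {
        return null; // outside the pinyin-ordered range
    }
    $initial = null;
    foreach ($boundaries as $start => $letter) {
        if ($key < $start) {
            break; // boundaries are listed in ascending order
        }
        $initial = $letter;
    }
    return $initial;
}

echo gb2312_initial("\xD6\xD0"), "\n"; // gb2312 bytes 214, 208
```

Input in another encoding would first have to be converted to gb2312, for example with iconv.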
The ORD() function returns the numeric code of the leftmost character of the given string and is often used in place of ASCII(). For single-byte characters its return value is exactly the same as that of ASCII(); for a multi-byte character, ORD() computes the code from all of the character's bytes. PHP has a function of the same name as well (ord()).
SELECT ORD('Y'), ORD('Simaopig'),
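On the PHP side, `ord()` returns the value of the first byte of its argument only, which is why multi-byte characters have to be examined byte by byte. A small illustration (the array name `$codes` is arbitrary):

```php
<?php
// ord() looks at the first byte only, so a longer string gives the same
// result as its first character.
$codes = [
    'Y'        => ord('Y'),
    'Simaopig' => ord('Simaopig'),
    'S'        => ord('S'),
];
print_r($codes);
```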