Analysis on the Conversion principle of Chinese Characters in PHP

Source: Internet
Author: User
Tags ultraedit

I. Analysis of conversion principle from Chinese characters to decimal characters

A Chinese Character in GBK encoding consists of two characters. The method for obtaining a Chinese character string is as follows:
Copy codeThe Code is as follows:
$ String = "Do not be infatuated with Brother ";
$ Length = strlen ($ string );
For ($ I = 0; $ I <$ length; $ I ++ ){
If (ord ($ string [$ I])> 127 ){
$ Result [] = ord ($ string [$ I]). ''. ord ($ string [++ $ I]);
}
}
Var_dump ($ result );


Because a Chinese character is composed of two characters, if the ASCII value of the character is obtained through the ord () function, if it is greater than 127, you can determine that the current character is the first half of a Chinese character, you also need to obtain the second half of the Chinese character. Of course, this method should be combined with the specific development environment. If there is a single character with an ASCII value greater than 127, this method is obviously incorrect.

The principle of converting Chinese characters to decimal in PHP is to use the for loop method to obtain two characters of a Chinese character, and then use the ord () function to convert each character to decimal. The above are: Do not [178 187] to [210 170 195] Fans [212 193] Love [181 184] Brother [231]

Ii. Analysis on the principle of converting Chinese characters to hexadecimal notation

You can use the UltraEdit development tool to directly view the hexadecimal format of Chinese characters, as shown in figure

For example, check the hexadecimal format of the five words "don't be infatuated with brother ".

From the above figure, we can see that the hexadecimal characters for each Chinese Character pair are: no B2BB wants D2AA fans C3D4 loves C1B5 brother B8E7

The principle of converting Chinese characters to hexadecimal in PHP is to first use the ord () function to retrieve the decimal digits of each Chinese character. For details, refer to [PHP functions> ord () and chr () function Application], and then convert each Chinese character to hexadecimal using the dechex () function

Instance source code
Copy codeThe Code is as follows:
$ String = "Do not be infatuated with Brother ";
$ Length = strlen ($ string );
Echo $ string;
$ Result = array ();
// Decimal
For ($ I = 0; $ I <$ length; $ I ++ ){
If (ord ($ string [$ I])> 127 ){
$ Result [] = ord ($ string [$ I]). ''. ord ($ string [++ $ I]);
}
}
Var_dump ($ result );
// Hexadecimal
$ Strings = array ();
Foreach ($ result as $ v ){
$ Dec = explode ("", $ v );
$ Strings [] = dechex ($ dec [0]). "". dechex ($ dec [1]);
}
Var_dump ($ strings );

The result is as follows:

The above method converts Chinese characters to hexadecimal. The output result can be compared with the hexadecimal format obtained by using the UltraEdit development tool.

Iii. Analysis of the conversion principle from Chinese characters to binary and octal characters

The implementation of the conversion of Chinese characters to binary and octal is the same as the above hexadecimal conversion principle, but the conversion function is different. Combined with the above example code, the implementation is as follows:

Convert a Chinese character to a binary string using the following method:
Copy codeThe Code is as follows:
$ Strings = array ();
Foreach ($ result as $ v ){
$ Dec = explode ("", $ v );
$ Strings [] = decbin ($ dec [0]). "". decbin ($ dec [1]);
}
Var_dump ($ strings );

The result is as follows:

Convert Chinese characters to octal characters using the following method:
Copy codeThe Code is as follows:
$ Strings = array ();
Foreach ($ result as $ v ){
$ Dec = explode ("", $ v );
$ Strings [] = decoct ($ dec [0]). "". decoct ($ dec [1]);
}

The result is as follows:

Understand the principle of converting Chinese Characters in PHP, and then use the PHP built-in function urldecode () to convert hexadecimal strings into normal Chinese characters by combining them, please pay attention to the urldecode () and urlencode () function character encoding principles in the next series of Chinese character encoding studies.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.