The fifth part of the Chinese character encoding study series describes the encoding principles of the urlencode () and urldecode () functions. the two functions are used to encode the URL string and decode the encoded URL string respectively, the principle of encoding Chinese characters is to convert Chinese characters into hexadecimal notation and combine strings according to certain rules to implement character encoding and decoding, ensure character integrity and compatibility during URL data transmission, and mainly discuss Chinese character encoding.
1. Chinese characters are encoded in FireFox
If you enter Chinese characters in the Firefox browser, the URL is automatically encoded as follows:
Before pressing Enter
Press Enter
II. urlencode () function principle
The urlencode () function is used to encode a URL string. here we mainly discuss the encoding of Chinese characters,
The instance is as follows:
The code is as follows:
Echo urlencode ('Do not be infatuated with go'); // output: % B2 % BB % D2 % AA % C3 % D4 % C1 % B5 % B8 % E7
Urlencode () function principle is to first convert Chinese characters to hexadecimal notation, and then add an identifier % before each character to understand this principle and implement a custom URL encoding function, the code is as follows:
The code is as follows:
$ String = "do not be infatuated with brother ";
$ Length = strlen ($ string );
Echo $ string;
$ Result = array ();
// Decimal
For ($ I = 0; $ I <$ length; $ I ++ ){
If (ord ($ string [$ I])> 127 ){
$ Result [] = ord ($ string [$ I]). ''. ord ($ string [++ $ I]);
}
}
Var_dump ($ result );
// Hexadecimal
$ Strings = array ();
Foreach ($ result as $ v ){
$ Dec = explode ("", $ v );
$ Strings [] = "%". dechex ($ dec [0]). "". "%". dechex ($ dec [1]);
}
Var_dump ($ strings );
The above code is described in detail in the analysis section on the principle of conversion from Chinese characters to hexadecimal characters in [analysis on the principle of conversion from Chinese characters to hexadecimal characters in PHP, the urlencode () function is implemented by obtaining the characters of a Chinese character and then converting them to hexadecimal notation, and adding a special identifier % before each character. the output result is as follows:
Then compare the output result with the characters directly encoded using urlencode (), such as: % B2 % BB % D2 % AA % C3 % D4 % C1 % B5 % B8 % E7
The above example shows that using the urlencode () function to encode Chinese characters is essentially converting the characters into hexadecimal notation and then adding a special identifier % to the left of the first character.
3. urldecode () function principle
Use the urldecode () function to decode the encoded URL string. The example is as follows:
Echo urldecode ('% B2 % BB % D2 % AA % C3 % D4 % C1 % B5 % B8 % E7'); // output: Do not be infatuated with your brother
The urldecode () function is opposite to the urlencode () function. it is used to decode the encoded URL string. The principle is to convert the hexadecimal string to a Chinese character, you can also implement custom function decoding strings.
The code is as follows:
$ String = '% B2 % BB % D2 % AA % C3 % D4 % C1 % B5 % B8 % E7 ';
$ Length = strlen ($ string );
$ Hexs = array ();
For ($ I = 0; $ I <$ length; $ I ++ ){
If ($ string [$ I] = '% '){
$ Hexs [] = $ string [++ $ I]. $ string [++ $ I];
}
}
$ Num = count ($ hexs );
For ($ I = 0; $ I <$ num; $ I ++ ){
Echo chr (hexdec ($ hexs [$ I]). chr (hexdec ($ hexs [++ $ I]);
}
The above example code first extracts the hexadecimal notation of each character according to the string rules, then converts the hexadecimal notation to the decimal using the hexdec () function, and then uses chr () the function converts decimal to a character and hexadecimal to a character. The output result is as follows:
IV. urldecode () and urlencode () functions
Urlencode
(PHP 3, PHP 4, PHP 5)
Urlencode -- encode a URL string
Description
String urlencode (string str)
Returns a string -_. all other non-alphanumeric characters will be replaced with a semicolon (%) followed by two hexadecimal numbers, and spaces will be encoded as the plus sign (+ ). This encoding method is the same as that for WWW form POST data and the same as that for application/x-www-form-urlencoded. For historical reasons, this encoding is different from RFC1738 encoding (see rawurlencode () in space encoding as the plus sign (+. This function allows you to encode a string and use it in the request part of a URL. It also allows you to pass variables to the next page.
Urldecode
(PHP 3, PHP 4, PHP 5)
Urldecode -- decode the encoded URL string
Description
String urldecode (string str)
Decodes any % # in the encoded string ##. Returns the decoded string.
5. Reference Resources
Urlencode () description
Urldecode () description