Source: coolcode.cn
A few days ago I wrote about a method for displaying a web page correctly under any character set. The idea is simple: every character outside the first 128 is written as an NCR (numeric character reference). I did not explain how to do the conversion, because at the time I thought it was too simple. Since then someone has asked about it, so here is a detailed explanation.
The first step is to convert the string from the source character set to UTF-16. In UTF-16 every character in the Basic Multilingual Plane occupies exactly two bytes, which makes the string easy to process; working directly on the source character set is much more complicated. The source character set can be read from the META tag of the original web page, or it can be specified separately. My program lets the user specify it in the form, because I cannot guarantee that the submitted file is an HTML file (other files are possible too; for example, the source file of the WordPress Chinese language pack is a PO file, and its contents can be processed the same way), and even an HTML file does not necessarily have a META tag that specifies the character set. Specifying the character set separately in the form is therefore safer.
You might think that converting from one character set to another is complicated, and if you implement it yourself it really is cumbersome. In PHP it is easy, because the iconv function converts between a wide range of character sets. If the iconv extension is not installed on your machine, you can use the mb_convert_encoding function instead. If the Multibyte String (mbstring) extension is not installed either, there is no practical way out: implementing that many encoding conversions yourself is basically impossible unless you are a top expert. I recommend iconv because it is efficient and supports a large number of character sets.
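The fallback order described above can be sketched as follows. This is a minimal illustration, not the author's code: the helper name to_utf16be is hypothetical, and the sketch assumes big-endian UTF-16 so that the high byte of each pair comes first.

```php
<?php
// Hypothetical helper: convert $str from the $encode character set to UTF-16BE.
function to_utf16be($encode, $str) {
    // Preferred path: the iconv extension.
    if (function_exists('iconv')) {
        return iconv($encode, 'UTF-16BE', $str);
    }
    // Fallback: the Multibyte String (mbstring) extension.
    if (function_exists('mb_convert_encoding')) {
        return mb_convert_encoding($str, 'UTF-16BE', $encode);
    }
    // Neither extension is available, so no conversion is possible.
    throw new RuntimeException('Neither iconv nor mbstring is available');
}

// "A" followed by U+4E2D, written as raw UTF-8 bytes.
$utf16 = to_utf16be('UTF-8', "A\xE4\xB8\xAD");
// bin2hex($utf16) is "00414e2d": two bytes per character, high byte first.
```

Naming the byte order explicitly (UTF-16BE rather than plain UTF-16) keeps the layout predictable across platforms.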
After that, the next step is to process the string two bytes at a time. Each pair of bytes is converted into a number XXXXX. If the number is less than 128, output the corresponding character directly (note that it becomes a single byte); otherwise output it in the form &#XXXXX;. One thing to note: when the number is 65279 (hexadecimal 0xFEFF), ignore it. This is the byte order mark (BOM), which only signals byte order in Unicode encodings, and since the resulting string contains only the first 128 characters of ISO-8859-1, it is no longer needed.
That is the basic idea. The implementation follows:
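To make the two-byte step concrete, here is a small worked example. It is only a sketch: the helper name utf16be_units is hypothetical, and it uses PHP's unpack function with the 'n*' format (unsigned 16-bit big-endian integers) instead of manual byte arithmetic.

```php
<?php
// Hypothetical helper: list the 16-bit numbers stored in a UTF-16BE string.
function utf16be_units($utf16) {
    // 'n*' reads consecutive unsigned 16-bit big-endian integers;
    // array_values re-indexes the result from 0 (unpack starts at 1).
    return array_values(unpack('n*', $utf16));
}

// Raw UTF-16BE bytes: BOM (0xFEFF), "A" (0x0041), U+4E2D (0x4E2D).
$units = utf16be_units("\xFE\xFF\x00\x41\x4E\x2D");
// $units is [65279, 65, 20013]:
//   65279 is the BOM and is skipped,
//   65 is below 128 and is written as the single byte "A",
//   20013 is written as the NCR &#20013;
```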
Download: nochaoscode.php
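<?php
function nochaoscode($encode, $str) {
    // Convert to big-endian UTF-16 so that every two bytes form one number.
    $str = iconv($encode, "UTF-16BE", $str);
    $output = "";
    for ($i = 0; $i < strlen($str); $i += 2) {
        // High byte * 256 + low byte gives the 16-bit code.
        $code = ord($str[$i]) * 256 + ord($str[$i + 1]);
        if ($code < 128) {
            // ASCII range: emit the character itself as a single byte.
            $output .= chr($code);
        } else if ($code != 65279) {
            // Anything else except the BOM (0xFEFF) becomes a decimal NCR.
            $output .= "&#" . $code . ";";
        }
    }
    return $output;
}
?>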
The function takes two parameters: $encode is the source character set and $str is the string to convert. It returns the converted string.
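The function above treats every two bytes as one character, which holds for the Basic Multilingual Plane. Characters beyond it (emoji, for example) are stored in UTF-16 as a surrogate pair of two 16-bit units, and a surrogate on its own is not a valid character reference. The following is a hedged sketch of a variant that combines such pairs into one code point; the name nochaoscode_astral is hypothetical and this is not part of the original program.

```php
<?php
// Hypothetical variant of nochaoscode that also handles surrogate pairs.
function nochaoscode_astral($encode, $str) {
    $str = iconv($encode, 'UTF-16BE', $str);
    $output = '';
    $len = strlen($str);
    for ($i = 0; $i + 1 < $len; $i += 2) {
        $code = ord($str[$i]) * 256 + ord($str[$i + 1]);
        // A high surrogate followed by a low surrogate encodes one code point.
        if ($code >= 0xD800 && $code <= 0xDBFF && $i + 3 < $len) {
            $low = ord($str[$i + 2]) * 256 + ord($str[$i + 3]);
            if ($low >= 0xDC00 && $low <= 0xDFFF) {
                $code = 0x10000 + (($code - 0xD800) << 10) + ($low - 0xDC00);
                $i += 2; // consume the low surrogate as well
            }
        }
        if ($code < 128) {
            $output .= chr($code);             // ASCII stays a single byte
        } elseif ($code != 65279) {
            $output .= '&#' . $code . ';';     // everything else except the BOM
        }
    }
    return $output;
}

// "A", U+4E2D and U+1F600, written as raw UTF-8 bytes.
$demo = nochaoscode_astral('UTF-8', "A\xE4\xB8\xAD\xF0\x9F\x98\x80");
// $demo is "A&#20013;&#128512;"
```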
Update: today Legend told me about a simpler way, which is to use the mb_convert_encoding function directly. mb_convert_encoding supports an encoding format called HTML-ENTITIES, which is the NCR encoding, so using it is easier.
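A minimal sketch of that one-line approach follows. The helper names to_ncr_mb and to_ncr_numeric are hypothetical. Two caveats: PHP 8.2 deprecates the HTML-ENTITIES encoding name in mbstring, and for characters that have a named HTML entity the output may use the name instead of a number. The second helper is an alternative sketch, not what Legend suggested, built on mb_encode_numericentity, which emits decimal references only.

```php
<?php
// Hypothetical helper: the direct call described above.
function to_ncr_mb($encode, $str) {
    // 'HTML-ENTITIES' is an mbstring pseudo-encoding (deprecated as of PHP 8.2).
    return mb_convert_encoding($str, 'HTML-ENTITIES', $encode);
}

// Hypothetical alternative: go through UTF-8, then encode numerically.
function to_ncr_numeric($encode, $str) {
    $utf8 = mb_convert_encoding($str, 'UTF-8', $encode);
    // Conversion map [start, end, offset, mask]: every code point from 0x80 up.
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($utf8, $map, 'UTF-8');
}

// "A", U+4E2D and U+6587, written as raw UTF-8 bytes.
$ncr = to_ncr_numeric('UTF-8', "A\xE4\xB8\xAD\xE6\x96\x87");
// $ncr is "A&#20013;&#25991;"
```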