PHP character encoding conversion class: converts between ANSI, Unicode, Unicode big endian, UTF-8, and UTF-8+BOM.
Four common text file encoding methods
ANSI encoding:
No file header (the marker bytes at the start of a file that identify its encoding)
Letters and digits take one byte each; Chinese characters take two bytes each (on Chinese-language Windows, ANSI means GBK)
Carriage return and line feed are one byte each, hexadecimal 0D 0A
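The byte counts above can be checked with iconv. This is a minimal sketch that assumes ANSI means GBK (as the class in this article does) and that the local iconv build supports GBK; the variable names are illustrative only.

```php
<?php
// "\xE4\xB8\xAD" is U+4E2D (a common Chinese character) written as UTF-8 bytes,
// so this script does not depend on the encoding of the file it is saved in.
$hanzi = "\xE4\xB8\xAD";

// Assumption: "ANSI" is GBK, as on Simplified Chinese Windows.
$letterGbk = iconv('UTF-8', 'GBK', 'A');
$hanziGbk  = iconv('UTF-8', 'GBK', $hanzi);
$crlfGbk   = iconv('UTF-8', 'GBK', "\r\n");

echo strlen($letterGbk), ' ', bin2hex($letterGbk), "\n"; // 1 41
echo strlen($hanziGbk), ' ', bin2hex($hanziGbk), "\n";   // 2 d6d0
echo strlen($crlfGbk), ' ', bin2hex($crlfGbk), "\n";     // 2 0d0a
```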
Unicode encoding (UTF-16 little endian):
File header, hexadecimal FF FE
Each character is encoded in two bytes, low byte first (characters outside the Basic Multilingual Plane take four bytes)
Carriage return and line feed are two bytes each, hexadecimal 0D 00 0A 00
Unicode big endian encoding (UTF-16 big endian):
File header, hexadecimal FE FF
Each character's high byte comes first and its low byte second, the exact reverse of the Unicode encoding above
Carriage return and line feed are two bytes each, hexadecimal 00 0D 00 0A
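The two byte orders are easy to confirm by dumping the bytes iconv produces. A small sketch, assuming an iconv build that treats the explicit UTF-16LE and UTF-16BE names as header-free encodings (the usual behaviour):

```php
<?php
// Encode the letter "A" followed by CR LF in both UTF-16 byte orders.
$text = "A\r\n";

$le = iconv('UTF-8', 'UTF-16LE', $text); // low byte first, no file header added
$be = iconv('UTF-8', 'UTF-16BE', $text); // high byte first, no file header added

echo bin2hex($le), "\n"; // 41000d000a00
echo bin2hex($be), "\n"; // 0041000d000a
```

Because neither call writes a file header, code that saves such output to a file has to prepend FF FE or FE FF itself.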
UTF-8 encoding:
File header, hexadecimal EF BB BF (present only in the UTF-8+BOM variant; plain UTF-8 has no file header)
UTF-8 is a variable-length encoding of Unicode: digits, letters, carriage returns, and line feeds take one byte each, and most Chinese characters take three bytes
Carriage return and line feed are one byte each, hexadecimal 0D 0A
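The file headers listed above are enough to tell three of the encodings apart. The helper below is a hypothetical sketch and not part of the class in this article; the name detectBom is an assumption, and the returned labels simply reuse the ones the class accepts. It cannot distinguish ANSI from plain UTF-8, since neither has a header.

```php
<?php
/**
 * Hypothetical helper: guess an encoding label from the leading header bytes.
 * Returns null when no header is found (ANSI or plain UTF-8).
 */
function detectBom(string $bytes): ?string
{
    if (strncmp($bytes, "\xEF\xBB\xBF", 3) === 0) {
        return 'utf-8bom';
    }
    if (strncmp($bytes, "\xFF\xFE", 2) === 0) {
        return 'unicode';   // UTF-16 little endian
    }
    if (strncmp($bytes, "\xFE\xFF", 2) === 0) {
        return 'unicodebe'; // UTF-16 big endian
    }
    return null;
}

echo detectBom("\xFF\xFE" . "A\x00"), "\n";   // unicode
echo detectBom("\xFE\xFF" . "\x00A"), "\n";   // unicodebe
echo detectBom("\xEF\xBB\xBF" . "A"), "\n";   // utf-8bom
var_dump(detectBom("plain text"));            // NULL
```

A UTF-32 little endian file also starts with FF FE, so this sketch is only reliable for the encodings covered here.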
Conversion principle: first convert the input from its source encoding to UTF-8, then convert from UTF-8 to the target encoding.
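Using UTF-8 as the intermediate form means each encoding needs only a "to UTF-8" and a "from UTF-8" path instead of one converter per pair. A minimal sketch of the two steps, assuming GBK as the source and UTF-16 little endian as the target (variable names are illustrative):

```php
<?php
// Step 1: source encoding -> UTF-8. The input is one Chinese character in GBK.
$gbk  = "\xD6\xD0";
$utf8 = iconv('GBK', 'UTF-8//IGNORE', $gbk);

// Step 2: UTF-8 -> target encoding, with the FF FE file header prepended by hand.
$utf16le = "\xFF\xFE" . iconv('UTF-8', 'UTF-16LE//IGNORE', $utf8);

echo bin2hex($utf8), "\n";    // e4b8ad
echo bin2hex($utf16le), "\n"; // fffe2d4e
```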
CharsetConv.class.php
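<?php
/**
 * Character encoding conversion class.
 * NOTE: the listing is truncated before the first in_array() check, so the
 * class declaration, the properties, and the constructor signature below are
 * reconstructed from how the surviving code uses them (assumption).
 */
class CharsetConv {

    private $_in_charset = null;   // source encoding
    private $_out_charset = null;  // output encoding
    private $_allow_charset = array('utf-8', 'utf-8bom', 'ansi', 'unicode', 'unicodebe');

    /** Initialize
     * @param string $in_charset  source encoding
     * @param string $out_charset output encoding
     */
    public function __construct($in_charset, $out_charset) {
        // Assumption: labels are normalized to lower case to match the case labels below
        $in_charset = strtolower($in_charset);
        $out_charset = strtolower($out_charset);

        // Check source encoding
        if (in_array($in_charset, $this->_allow_charset)) {
            $this->_in_charset = $in_charset;
        }

        // Check output encoding
        if (in_array($out_charset, $this->_allow_charset)) {
            $this->_out_charset = $out_charset;
        }
    }

    /** Convert
     * @param  string $str string to be converted
     * @return string converted string
     */
    public function convert($str) {
        $str = $this->convToUtf8($str);   // first to UTF-8
        $str = $this->convFromUtf8($str); // then from UTF-8 to the output encoding
        return $str;
    }

    /** Convert the source encoding to UTF-8
     * @param  string $str
     * @return string
     */
    private function convToUtf8($str) {
        if ($this->_in_charset == 'utf-8') { // already UTF-8, nothing to convert
            return $str;
        }

        switch ($this->_in_charset) {
            case 'utf-8bom':
                $str = substr($str, 3); // drop the EF BB BF header
                break;
            case 'ansi':
                $str = iconv('GBK', 'UTF-8//IGNORE', $str);
                break;
            case 'unicode':
                $str = iconv('UTF-16LE', 'UTF-8//IGNORE', substr($str, 2)); // skip FF FE
                break;
            case 'unicodebe':
                $str = iconv('UTF-16BE', 'UTF-8//IGNORE', substr($str, 2)); // skip FE FF
                break;
            default:
                break;
        }

        return $str;
    }

    /** Convert UTF-8 to the output encoding
     * @param  string $str
     * @return string
     */
    private function convFromUtf8($str) {
        if ($this->_out_charset == 'utf-8') { // output is already UTF-8, nothing to convert
            return $str;
        }

        switch ($this->_out_charset) {
            case 'utf-8bom':
                // Double quotes are required so that PHP interprets the \x escapes
                $str = "\xEF\xBB\xBF" . $str;
                break;
            case 'ansi':
                $str = iconv('UTF-8', 'GBK//IGNORE', $str);
                break;
            case 'unicode':
                $str = "\xFF\xFE" . iconv('UTF-8', 'UTF-16LE//IGNORE', $str);
                break;
            case 'unicodebe':
                $str = "\xFE\xFF" . iconv('UTF-8', 'UTF-16BE//IGNORE', $str);
                break;
            default:
                break;
        }

        return $str;
    }

} // class end
?>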
Demo: Unicode big endian to UTF-8+BOM
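<?php
require 'CharsetConv.class.php';

// The start of this listing is truncated in the source. The input path below is
// a placeholder, $obj is an assumed variable name, and the two encoding labels
// are taken from the demo title.
$str = file_get_contents('source/unicodebe.txt');

$obj = new CharsetConv('unicodebe', 'utf-8bom');
$response = $obj->convert($str);

file_put_contents('response/utf-8bom.txt', $response);
?>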