DOM4J cannot save the XML file as UTF-8, Invalid byte 2 of 2-byte UTF-8 Sequence-hxzon hands-gdo
I recently started learning dom4j. I found an article online and got up to speed quickly, but hit one problem: saving as UTF-8 did not work
PHP character encoding conversion class,
supporting conversion among ANSI, Unicode, Unicode big endian, UTF-8, and UTF-8+BOM.
Four common text file encoding methods
ANSI encoding:
No file header (the signature bytes at the start of a file that identify its encoding)
In ANSI encoding, letters and digits occupy one byte each; Chinese characters accou
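The "file header" idea above can be sketched as a byte order mark (BOM) check. This is a minimal Python sketch, not the PHP class the article describes. The function name `sniff_bom` and the returned labels are hypothetical, and reading "Unicode" as UTF-16 little endian (the Windows Notepad naming) is an assumption.

```python
import codecs

# BOM signatures for the encodings named above. Reading "Unicode" as
# UTF-16 little endian follows Windows Notepad naming (an assumption).
BOMS = [
    (codecs.BOM_UTF8, "UTF-8+BOM"),
    (codecs.BOM_UTF16_LE, "Unicode (UTF-16 little endian)"),
    (codecs.BOM_UTF16_BE, "Unicode big endian (UTF-16 big endian)"),
]


def sniff_bom(data: bytes) -> str:
    """Return a label for the BOM at the start of data, if any.

    UTF-32 BOMs are deliberately not handled in this sketch.
    """
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    # ANSI and plain UTF-8 files carry no header, so a BOM check
    # alone cannot tell them apart.
    return "no BOM (ANSI or UTF-8 without BOM)"


print(sniff_bom(b"\xef\xbb\xbfhello"))
print(sniff_bom(b"hello"))
```

A file with no BOM still needs content-based detection or an explicit declaration, which is why the ANSI case has "no file header".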
UTF-32 stores each character in 4 bytes, which guarantees that all of UCS can be represented. However, the number of characters in UCS does not need all 32 bits, so UTF-32 wastes a great deal of space. In addition, because of combining characters, fixed-width storage does not locate characters as quickly as one might expect. Overall it is a poor choice. UTF-16 maps UCS to 16-bit integers for data storage
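The space trade-off described above can be checked directly. The following Python sketch counts the bytes one character takes in each encoding. The helper name `encoded_sizes` and the sample characters are illustrative choices, not from the article.

```python
def encoded_sizes(ch: str) -> dict:
    """Return the number of bytes one character takes in each encoding.

    The "-le" codec variants are used so that no BOM is added to the count.
    """
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),
        "utf-32": len(ch.encode("utf-32-le")),
    }


# An ASCII letter, a common Chinese character, and a character
# outside the Basic Multilingual Plane.
for ch in ["A", "中", "😀"]:
    print(repr(ch), encoded_sizes(ch))
```

UTF-32 is always 4 bytes, while UTF-16 needs 4 bytes only for characters outside the Basic Multilingual Plane, where it uses a surrogate pair.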
This article introduces a PHP character encoding conversion class that supports conversion among ANSI, Unicode, Unicode big endian, UTF-8, and UTF-8+BOM. Readers interested in PHP tutorials can refer to it.
Garbled text when converting PHP pages and MySQL databases to UTF-8: a summary of UTF-8 encoding problems (MySQL UTF-8)
Example one:
UTF-8 encoding problems in PHP pages
1. Add a line at the beginning of the code: header
[Repost] A Chinese character occupies three bytes in UTF-8; UTF-8 byte lengths.
The answer from Baidu is vivid and memorable, so I am noting it here.
Original link: https://zhidao.baidu.com/question/1047887004693001899.html
Zhihu also has a clearer answer: https://www.zhihu.com/question/23374078
1. Am
Example one:
UTF-8 encoding problems in PHP pages
1. Add a line at the beginning of the code: header("Content-type:text/html;charset=utf-8");
2. PHP file encoding problem: click the editor's menu "File" -> "Save As" to see the current file encoding, and make sure the file encoding is:
multiple encoding methods in the world. The same binary number can be interpreted as different symbols. Therefore, to open a text file, you must know its encoding method; if you decode it with the wrong one, garbled characters appear. Why do emails often contain garbled characters? Because the sender and the receiver use different encoding methods. Now imagine a single encoding that included every symbol in the world. Every symbol is given a unique encod
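The point that the same bytes read as different symbols under different encodings can be shown in a few lines of Python. The sample string and the choice of Latin-1 as the "wrong" encoding are illustrative assumptions; any mismatched pair shows the same effect.

```python
text = "中文"

# The sender writes the text as UTF-8 bytes.
raw = text.encode("utf-8")

# A receiver that assumes Latin-1 never gets an error, because every
# byte maps to some character, but the result is garbled.
wrong = raw.decode("latin-1")

# A receiver that knows the real encoding recovers the text.
right = raw.decode("utf-8")

print(len(raw), "bytes on the wire")
print("decoded as Latin-1:", repr(wrong))
print("decoded as UTF-8:  ", repr(right))
```

The bytes are the same in both cases; only the interpretation changes.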
If a website needs to be internationalized, its encoding must be converted from GB2312 to UTF-8. There are many issues to watch for: if the conversion is incomplete, many encoding problems will appear. This article shares a summary of the garbled-character problems that arise when converting PHP pages and MySQL databases to UTF-8.
When using IE as the browser on Windows, a problem often occurs when browsing a web page that uses UTF-8 encoding: the browser does not automatically detect the encoding used by the page (that is, when the "Auto-Select" encoding option is not enabled),
even if the page has declared its encoding:
This causes some pages containing Chinese
The GBK and UTF-8 versions of commonly used site-building programs have the same features; they differ only in encoding.
GBK encodes Chinese characters in two bytes, with the highest bit of the first byte set to 1 to distinguish them from English (ASCII) characters, which take a single byte.
As for the UTF-
http://blog.csdn.net/thl789/article/details/7506133
https://zhuanlan.zhihu.com/p/23654187?refer=dreawer
http://www.ruanyifeng.com/blog/2007/10/ascii_unicode_and_utf-8.html
UTF-8
UTF-8 (8-bit Unicode Transformation Format) is a variable-length character encoding for Unicode that encodes each character with one to four byt
filled in sequentially, and the remaining bits are padded with 0. This gives the UTF-8 encoding of "严" ("strict") as "11100100 10111000 10100101", which is E4B8A5 in hexadecimal.
6. Conversion between Unicode and UTF-8
Using the example in the previous section, the Unicode code point of "严" is 4E25, UTF-
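The bit-filling step described above can be sketched in Python for the three-byte case only. The function name `utf8_encode_3byte` is hypothetical, and the sketch does not reject surrogate code points (D800 to DFFF), which a real encoder must refuse.

```python
def utf8_encode_3byte(code_point: int) -> bytes:
    """Encode a code point in the range U+0800..U+FFFF as three UTF-8 bytes.

    Template: 1110xxxx 10xxxxxx 10xxxxxx, with the x bits filled from the
    code point's 16 bits, highest bits first.
    """
    if not 0x0800 <= code_point <= 0xFFFF:
        raise ValueError("only the three-byte range is handled in this sketch")
    b1 = 0b11100000 | (code_point >> 12)               # top 4 bits
    b2 = 0b10000000 | ((code_point >> 6) & 0b111111)   # middle 6 bits
    b3 = 0b10000000 | (code_point & 0b111111)          # low 6 bits
    return bytes([b1, b2, b3])


encoded = utf8_encode_3byte(0x4E25)
print(" ".join(f"{b:08b}" for b in encoded))  # 11100100 10111000 10100101
print(encoded.hex())                          # e4b8a5
```

The result matches Python's built-in `"严".encode("utf-8")`.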
Reposted from: https://www.cnblogs.com/kclteam/p/5278926.html
The new project's situation is roughly this: there may be users from many countries speaking different languages, such as Chinese, Traditional Chinese, Korean, and Japanese, so UTF-8 was chosen for development. Development went smoothly, with no problems. Yesterday I built a CSV export feature, and the exported content was completely garbled:
Set mb_c
How to match Chinese characters with a UTF-8 regular expression
The following code checks whether the input contains illegal characters:
$str = "programming"; //
if (!preg_match("/^[\x{4e00}-\x{9fa5}A-Za-z0-9_]+$/u", $str)) //
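For comparison, the same character class can be written in Python. This is a sketch of an equivalent check, not the article's PHP code. The names `ALLOWED` and `is_valid` are hypothetical, and `fullmatch` stands in for the `^...$` anchors.

```python
import re

# Chinese characters in U+4E00..U+9FA5, ASCII letters, digits, underscore.
ALLOWED = re.compile(r"[\u4e00-\u9fa5A-Za-z0-9_]+")


def is_valid(s: str) -> bool:
    """Return True when every character of s is in the allowed set."""
    return ALLOWED.fullmatch(s) is not None


print(is_valid("编程"))       # True
print(is_valid("abc_123"))    # True
print(is_valid("hello world"))  # False: the space is not allowed
```

Python strings are already Unicode, so no counterpart of the PHP `/u` modifier is needed.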
Does utf8_general_ci in MySQL correspond to a PHP file saved in the UTF-8 without BOM file format?
CP936 to UTF-8, cp936 to UTF-8
Recently I wrote a scraping script. Most of the content is captured correctly, but a small part of it is garbled.
Checking the character encoding gives CP936.
mb_detect_encoding($str, 'gbk, gb2312, GB18030, ISO-8859-1, ASCII, UTF-
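Once the bytes are known to be CP936, the conversion itself is a decode followed by an encode. This is a minimal Python sketch under that assumption. It does not reproduce the article's PHP approach, and the function name `cp936_to_utf8` is hypothetical.

```python
def cp936_to_utf8(raw: bytes) -> bytes:
    """Re-encode CP936 bytes as UTF-8.

    Raises UnicodeDecodeError if raw is not valid CP936, which is a
    useful signal that the detected encoding was wrong.
    """
    return raw.decode("cp936").encode("utf-8")


# Simulate scraped content that arrived as CP936 bytes.
sample = "中文".encode("cp936")
converted = cp936_to_utf8(sample)

print(len(sample), "bytes as CP936")
print(len(converted), "bytes as UTF-8")
print(converted.decode("utf-8"))
```

Encoding detection is heuristic, so converting only after a confident detection, and failing loudly otherwise, avoids silently producing more garbled text.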
[Repost] Regular expression for UTF-8 Chinese characters
Link: http://blog.csdn.net/wide288/article/details/30066639
$str = "programming"; //
if (!preg_match("/^[\x{4e00}-\x{9fa5}A-Za-z0-9_]+$/u", $str)) // UTF-