How to tell what encoding the data returned by file_get_contents is in
Source: Internet
Author: User
How can I determine which encoding the data returned by file_get_contents uses?
------Solution--------------------
EUC-CN is the most commonly used representation of GB 2312. The "GB2312" entry in a browser's encoding menu usually means the EUC-CN form.
But mb_detect_encoding may still not give you the right result.
Look at where each encoding sits in your list "Gb2312,gbk,utf-8": the position of an encoding in the list affects which one is reported.
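To illustrate how the candidate list affects detection, here is a minimal sketch. It assumes the mbstring extension is loaded; guess_encoding is a hypothetical helper name, and putting UTF-8 first with strict checking enabled is one reasonable choice, not the only one.

```php
<?php
// Hypothetical helper: guess the encoding of $data from an ordered
// list of candidates. Assumes the mbstring extension is available.
function guess_encoding(string $data, array $candidates = ['UTF-8', 'EUC-CN'])
{
    // The third argument enables strict mode: a candidate is rejected
    // if $data contains a byte sequence that is invalid in it.
    // The result is an encoding name, or false if nothing matches.
    return mb_detect_encoding($data, $candidates, true);
}

$utf8Bytes = "\xE4\xB8\xAD\xE6\x96\x87"; // "中文" encoded as UTF-8
$gbBytes   = "\xD6\xD0\xCE\xC4";         // "中文" encoded as GB 2312 (EUC-CN)

var_dump(guess_encoding($utf8Bytes));
var_dump(guess_encoding($gbBytes));
var_dump(guess_encoding($gbBytes, ['UTF-8'])); // no candidate fits
```

UTF-8 has strict rules for multi-byte sequences, so text in another encoding rarely passes as valid UTF-8. That is why it is a sensible first candidate. The result is still only a guess.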
------Solution--------------------
The encoding determines how the content is read; the content alone cannot determine the encoding.
For example, one GBK Chinese character can equally well be read as two ISO-8859-1 characters.
So mb_detect_encoding can only guess, and a guess cannot be guaranteed to be right (in fact, there is no absolutely right answer).
If what file_get_contents returns is a web page, you can read the encoding from its meta tag.
If that information is missing, all you can do is guess.
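The ambiguity described above can be shown with a short sketch. It assumes the mbstring extension is loaded; decode_as is a hypothetical helper name used only for illustration.

```php
<?php
// Hypothetical helper: interpret raw bytes as $encoding and return
// the resulting text as UTF-8. Assumes mbstring is available.
function decode_as(string $bytes, string $encoding): string
{
    return mb_convert_encoding($bytes, 'UTF-8', $encoding);
}

// The two bytes that encode the character 中 in GBK.
$bytes = "\xD6\xD0";

$asGbk    = decode_as($bytes, 'GBK');        // one Chinese character
$asLatin1 = decode_as($bytes, 'ISO-8859-1'); // two Latin characters

// Same bytes, two different but equally "valid" readings.
echo mb_strlen($asGbk, 'UTF-8'), ' ', mb_strlen($asLatin1, 'UTF-8'), "\n"; // prints "1 2"
```

ISO-8859-1 assigns a character to every byte value, so any byte string is valid in it. A validity check alone can therefore never rule it out.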
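Reading the encoding from the meta tag could look roughly like the sketch below. charset_from_meta is a hypothetical helper, and the regular expression is a simplification that covers only the two common forms of the tag; it is not a full HTML parser.

```php
<?php
// Hypothetical helper: return the charset declared in an HTML meta
// tag in upper case, or null if no declaration is found.
function charset_from_meta(string $html): ?string
{
    // Covers <meta charset="gbk"> as well as
    // <meta http-equiv="Content-Type" content="text/html; charset=gbk">.
    if (preg_match('/<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_\-]+)/i', $html, $m)) {
        return strtoupper($m[1]);
    }
    return null;
}

// Sample input standing in for the result of file_get_contents().
$html = '<html><head><meta charset="gbk"></head><body></body></html>';

$charset = charset_from_meta($html);
if ($charset === null) {
    echo "No declaration found, fall back to guessing\n";
} else {
    echo "Declared charset: $charset\n"; // prints "Declared charset: GBK"
}
```

The declared value can then be passed as the source encoding to mb_convert_encoding. Note that a page may declare a name mbstring does not recognize, so that call should be guarded.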