This note looks at the HTML-ENTITIES encoding in PHP. When scraping a web page with fabpot/goutte (https://github.com/FriendsOfPHP/Goutte), I found that no matter what encoding the target page uses (gb2312...), the final result is always Unicode.
It turns out that Symfony's crawler converts the page content to the HTML-ENTITIES encoding:
mb_convert_encoding($content, 'HTML-ENTITIES', $charset);
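To see the effect of that call, here is a minimal sketch, assuming the mbstring extension is loaded. The sample string and the helper name to_html_entities are illustrative and not part of Symfony. It passes the same two characters in once as GB2312 bytes and once as UTF-8 bytes. Note that PHP 8.2 and later treat 'HTML-ENTITIES' as a deprecated target for mb_convert_encoding, so the call may emit a deprecation notice there.

```php
<?php
// Hypothetical helper (not Symfony code): wraps the same call shown above.
function to_html_entities(string $content, string $charset): string
{
    return mb_convert_encoding($content, 'HTML-ENTITIES', $charset);
}

// The two characters "中文" as raw bytes in two different source encodings.
$gb2312 = "\xD6\xD0\xCE\xC4";          // GB2312 bytes
$utf8   = "\xE4\xB8\xAD\xE6\x96\x87";  // UTF-8 bytes

// Each call turns the non-ASCII characters into numeric character references.
$fromGb2312 = to_html_entities($gb2312, 'GB2312');
$fromUtf8   = to_html_entities($utf8, 'UTF-8');

echo $fromGb2312, "\n";
echo $fromUtf8, "\n";
var_dump($fromGb2312 === $fromUtf8);
```

Both lines should print &#20013;&#25991;, the decimal forms of the code points U+4E2D and U+6587. The source charset no longer matters once the text is expressed this way, which is why the crawler's output looks the same for every page encoding.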
Wikipedia covers the background: the HTML-ENTITIES encoding is based on Unicode (http://en.wikipedia.org/wiki/Character_encodings_in_HTML).
Reference
A numeric character reference in HTML refers to a character by its Universal Character Set/Unicode code point
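A short sketch of what that definition means in practice, assuming PHP 7.2 or later with mbstring available (for mb_ord). The sample character is an illustrative choice. The number inside a reference is the character's Unicode code point, so decoding the reference and then asking for the code point gives the same number back.

```php
<?php
// "&#20013;" is a decimal numeric character reference.
$reference = '&#20013;';

// Decode it into a UTF-8 string; the output charset is chosen here,
// independently of whatever encoding the original page used.
$char = html_entity_decode($reference, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// The code point of the decoded character matches the number in the reference.
$codePoint = mb_ord($char, 'UTF-8');

echo bin2hex($char), "\n";                        // UTF-8 bytes of the character, in hex
echo $codePoint, "\n";                            // decimal code point
echo 'U+', strtoupper(dechex($codePoint)), "\n";  // same code point in hex
```

This should print e4b8ad, 20013, and U+4E2D.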
Recorded here for future reference.