PHP simple DOM HTML parsing: garbled characters

1. Fixing garbled characters

Sure enough, garbled characters showed up on the very first attempt, even though I had followed the documentation and every string was encoded as UTF-8:

$html = '<p>Hi!</p>'; // UTF-8 text; the problem shows up with non-ASCII characters
$dom = new DOMDocument();
@$dom->loadHTML($html);
echo $dom->documentElement->nodeValue;

However, if loadHTML is replaced with loadXML:

$html = '<p>Hi!</p>'; // same UTF-8 string as before
$dom = new DOMDocument();
@$dom->loadXML($html);
echo $dom->documentElement->nodeValue;

the output is correct. Later I discovered that loadHTML relies on the meta tag declared in the HTML to determine the character set. If there is no such tag, the input is treated as ISO-8859-1, which is why the text comes out garbled. To solve this, prepend such a tag to the string before loading it:

$meta = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>';
@$dom->loadHTML($meta . $html);
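
The fix can be tried end to end with a short script. This is only a sketch: the helper name load_utf8_html and the sample string are made up for illustration, and the meta tag is written in its common Content-Type form, which is an assumption about what the tag should look like.

```php
<?php
// Hypothetical helper (not from the article): load a UTF-8 HTML fragment
// by prepending a charset declaration so loadHTML does not fall back to
// ISO-8859-1.
function load_utf8_html(string $html): DOMDocument
{
    // Assumed form of the meta tag; it declares the input as UTF-8.
    $meta = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">';
    $dom = new DOMDocument();
    // "@" silences parser warnings, as in the article's snippets.
    @$dom->loadHTML($meta . $html);
    return $dom;
}

// Illustrative sample: any UTF-8 string with non-ASCII characters will do.
$dom = load_utf8_html('<p>héllo wörld</p>');
$p = $dom->getElementsByTagName('p')->item(0);
echo $p->nodeValue, "\n";
```

Looking up the p element directly avoids depending on the html, head and body wrappers that the HTML parser adds around a fragment.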

2. Recursion

HTML/XML has a recursive tree structure, so traversing it recursively is unavoidable:

function _pretty_html_node($node) {
    $text = '';
    $child_text = '';
    // Recursion termination conditions:
    // 1. XML_TEXT_NODE
    // 2. XML_ELEMENT_NODE
    // 3. no child nodes
    foreach ($node->childNodes as $n) {
        $child_text .= _pretty_html_node($n);
    }
    // Then handle each tag differently.
    $tag = $node->nodeName;
    switch ($tag) {
        case 'a':
            $href = $node->getAttribute('href');
            $text .= "<a href=\"$href\">$child_text</a>";
            ...
    }
    return $text;
}
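
For a version that runs as-is, here is a minimal sketch of the same recurse-then-switch pattern. It is not the article's _pretty_html_node: the function node_to_text, the plain-text output format and the sample markup are all assumptions chosen to keep the example small.

```php
<?php
// Hypothetical helper: render a DOM subtree as plain text, appending the
// target of each link in square brackets.
function node_to_text(DOMNode $node): string
{
    // Termination: a text node is a leaf, so return its content directly.
    if ($node->nodeType === XML_TEXT_NODE) {
        return $node->nodeValue;
    }
    // Ignore comments, processing instructions and other node types.
    if ($node->nodeType !== XML_ELEMENT_NODE) {
        return '';
    }

    // Recurse into the children first; an element without children
    // simply contributes an empty string.
    $childText = '';
    foreach ($node->childNodes as $child) {
        $childText .= node_to_text($child);
    }

    // Then treat each tag differently.
    switch ($node->nodeName) {
        case 'a':
            return $childText . ' [' . $node->getAttribute('href') . ']';
        case 'br':
            return "\n";
        default:
            return $childText;
    }
}

$dom = new DOMDocument();
$dom->loadXML('<p>See <a href="http://example.com/">the <b>docs</b></a> first.</p>');
echo node_to_text($dom->documentElement), "\n";
```

Because the children are processed before the switch, every case already has the fully rendered inner text available and only needs to decide how to wrap it.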

3. Handling escaped characters

For a text node, the nodeValue must be passed through htmlspecialchars() before it is written back out, because entities are decoded when the HTML/XML is read. For example, &gt; is already > in memory.
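
The round trip is easy to observe. The snippet below is a small sketch with a made-up sample string; it reads the decoded text from the DOM and escapes it again for output.

```php
<?php
$dom = new DOMDocument();
// The source markup contains entities.
$dom->loadXML('<p>1 &lt; 2 &amp;&amp; 3 &gt; 2</p>');

// In memory the entities are already decoded to plain characters.
$raw = $dom->documentElement->textContent;

// Escape again before embedding the text in HTML/XML output.
$escaped = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

echo $raw, "\n";     // 1 < 2 && 3 > 2
echo $escaped, "\n"; // 1 &lt; 2 &amp;&amp; 3 &gt; 2
```

Skipping the htmlspecialchars() call would emit bare < and & characters, which can produce markup that is no longer well-formed.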

Download source code: pretty_html.php

