Using the Simple HTML DOM parser for PHP

Source: Internet
Author: User
This article introduces how to use the Simple HTML DOM parser.

1. Getting started

First download and unpack the library, then include simple_html_dom.php in the script that will do the parsing. Next, load the HTML to be processed. Three ways of loading HTML are supported: from a URL, from a string, and from a file.

The Code is as follows:

require_once('simple_html_dom.php');
// Load from a URL
$html = file_get_html('http://www.111cn.net');
// Load from a string
$html = str_get_html('Hello World!');
// Load from a file
$html = file_get_html('example.htm');
To load a remote page from a string, you must first download it from the network. cURL is a good choice for this. It requires the php_curl extension to be enabled in the PHP configuration file (php.ini).

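$url = 'http://www.111cn.net';
$ci = curl_init();
curl_setopt($ci, CURLOPT_URL, $url);
// Skip certificate checks (acceptable for quick tests only)
curl_setopt($ci, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ci, CURLOPT_SSL_VERIFYHOST, false);
// Return the response body instead of printing it
curl_setopt($ci, CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec($ci);
curl_close($ci);
// Parse the downloaded string
$html = str_get_html($result);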
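
The download-then-parse flow described above can be wrapped in a small helper. The following is only a sketch under a few assumptions: the function name fetch_html is hypothetical, simple_html_dom.php is assumed to sit in the same directory as the script, and the timeout and redirect settings are illustrative choices rather than requirements of the library.

```php
<?php
require_once 'simple_html_dom.php'; // assumed to be in the same directory

/**
 * Hypothetical helper: download a page with cURL and parse it.
 * Returns a simple_html_dom object, or null on any failure.
 */
function fetch_html($url)
{
    $ci = curl_init();
    curl_setopt($ci, CURLOPT_URL, $url);
    curl_setopt($ci, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ci, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ci, CURLOPT_TIMEOUT, 10);          // give up after 10 seconds
    $body = curl_exec($ci);                         // false on failure
    curl_close($ci);

    if ($body === false || $body === '') {
        return null;
    }
    $html = str_get_html($body);                    // may return false, e.g. on empty input
    return $html === false ? null : $html;
}

$html = fetch_html('http://www.111cn.net');
if ($html === null) {
    echo "Download or parse failed\n";
} else {
    echo count($html->find('a')), " links found\n";
}
```

Returning null for every failure keeps the calling code to a single check before it starts querying the document.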

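2. Searching for HTML elements
Use the find function to search the document. Called with a selector only, it returns an array of element objects. Called with a selector and an index, it returns a single element, or null if nothing matches. Common searches are as follows.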

The Code is as follows:
// Find all hyperlink elements
$alink = $html->find('a');
// Find the hyperlink at index 5 (the index is zero-based, so this is the sixth link)
$alink = $html->find('a', 5);
// Find the p elements whose id is main
$mainDiv = $html->find('p[id=main]');
// Find all p elements that have an id attribute
$idDiv = $html->find('p[id]');
// Find all elements that have an id attribute
$idAll = $html->find('[id]');
// Find elements whose class is info
$classInfo = $html->find('.info');
// Nested (descendant) selectors are supported
$ret = $html->find('ul li');
// Find several kinds of elements at once
$ret = $html->find('a, img, p');
// ...
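
Because find returns an array when no index is given, the results are usually consumed in a loop. The sketch below illustrates this on a small HTML string invented for the example; it assumes simple_html_dom.php is in the same directory as the script.

```php
<?php
require_once 'simple_html_dom.php'; // assumed to be in the same directory

// Sample markup made up for this illustration.
$html = str_get_html(
    '<ul><li><a href="/a">First</a></li><li><a href="/b">Second</a></li></ul>'
);

// Without an index, find() returns an array, so iterate over it.
$hrefs = array();
foreach ($html->find('ul li a') as $a) {
    $hrefs[] = $a->href;
}
echo implode(', ', $hrefs), "\n"; // prints "/a, /b"

// With an index, find() returns one element, or null if there is no match.
$third = $html->find('a', 2);
var_dump($third === null); // bool(true): the sample contains only two links
```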

3. Miscellaneous
You can also move from one element to another with built-in methods: parent returns the parent element, children returns an array of child elements, first_child returns the first child element, last_child returns the last child element, prev_sibling returns the previous sibling element, and next_sibling returns the next sibling element.
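
As a rough illustration of these navigation methods, the sketch below walks a three-item list. The markup is invented for the example, and simple_html_dom.php is assumed to be in the same directory as the script.

```php
<?php
require_once 'simple_html_dom.php'; // assumed to be in the same directory

// Sample markup made up for this illustration (no whitespace between tags).
$html = str_get_html('<ul id="menu"><li>One</li><li>Two</li><li>Three</li></ul>');

$ul     = $html->find('ul', 0);
$first  = $ul->first_child();     // <li>One</li>
$last   = $ul->last_child();      // <li>Three</li>
$second = $first->next_sibling(); // <li>Two</li>

echo count($ul->children()), "\n";           // 3
echo $first->innertext, "\n";                // One
echo $second->innertext, "\n";               // Two
echo $last->prev_sibling()->innertext, "\n"; // Two
echo $second->parent()->id, "\n";            // menu
```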

Attribute selectors can also be filtered with simple regular-expression-like patterns, written in the same bracket form as the [id] and [id=main] selectors above.
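
The sketch below shows a few of the attribute filters the library accepts: [attr] (attribute present), [attr^=value] (starts with), [attr$=value] (ends with) and [attr*=value] (contains). The markup and URLs are invented for the example, and simple_html_dom.php is assumed to be in the same directory as the script.

```php
<?php
require_once 'simple_html_dom.php'; // assumed to be in the same directory

// Sample markup made up for this illustration.
$html = str_get_html(
    '<a href="http://example.com/a.pdf">A</a>' .
    '<a href="/local/b.html">B</a>' .
    '<a name="anchor">C</a>'
);

$withHref   = $html->find('a[href]');         // links that have an href at all
$startsHttp = $html->find('a[href^=http]');   // href starts with "http"
$endsPdf    = $html->find('a[href$=pdf]');    // href ends with "pdf"
$hasLocal   = $html->find('a[href*=local]');  // href contains "local"

echo count($withHref), "\n";   // 2
echo count($startsHttp), "\n"; // 1
echo count($endsPdf), "\n";    // 1
echo count($hasLocal), "\n";   // 1
```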

Each element object has four basic properties:
tag - returns the HTML tag name
innertext - returns the inner HTML
outertext - returns the outer HTML
plaintext - returns the text inside the HTML tag
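
To make the difference between these properties concrete, the sketch below reads all four from one element. The markup is invented for the example, and simple_html_dom.php is assumed to be in the same directory as the script.

```php
<?php
require_once 'simple_html_dom.php'; // assumed to be in the same directory

// Sample markup made up for this illustration.
$html = str_get_html('<div id="main"><b>Hello</b> World</div>');
$div  = $html->find('div', 0);

echo $div->tag, "\n";       // div
echo $div->innertext, "\n"; // <b>Hello</b> World
echo $div->outertext, "\n"; // the element's full markup, including the <div> tags
echo $div->plaintext, "\n"; // Hello World
```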

Reading an element's attribute value

// Get the href value of $alink
$link = $alink->href;
You can add, modify, or remove elements and attributes by assigning new values to these properties.

The Code is as follows:

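// Remove the href attribute from a link
$alink->href = null;
// Modify elements ($ret must be a single element, e.g. from find('ul li', 0))
// Wrap the element in a container; the <div> tags here are illustrative
$ret->outertext = '<div>' . $ret->outertext . '</div>';
// Remove the element
$ret->outertext = '';
// Append markup after the element
$ret->outertext = $ret->outertext . '<div>Other</div>';
// Insert markup before the element
$ret->outertext = '<div>Welcome</div>' . $ret->outertext;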
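
Changes made this way only affect the in-memory document until it is written back out. The sketch below modifies, removes, and adds attributes on one link and then serializes the document with save. The markup and attribute values are invented for the example, and simple_html_dom.php is assumed to be in the same directory as the script.

```php
<?php
require_once 'simple_html_dom.php'; // assumed to be in the same directory

// Sample markup made up for this illustration.
$html = str_get_html('<p><a id="x" href="/old">link</a></p>');
$a    = $html->find('a', 0);

$a->href      = '/new';         // modify an existing attribute
$a->id        = null;           // remove an attribute
$a->title     = 'Example';      // add a new attribute
$a->innertext = 'updated link'; // replace the element's content

// save() returns the whole document as a string; pass a file path to write it to disk.
$output = $html->save();
echo $output, "\n";
```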