Parse HTML tags using PHP Simple HTML DOM Parser

Source: Internet
Author: User

I have been using PHP Simple HTML DOM Parser to parse HTML pages, and it works well. It builds a DOM tree from a page so that you can query its content, which makes it handy for scraping. Here is an example:

Scraping data with PHP Simple HTML DOM Parser

PHP Simple HTML DOM Parser, written in PHP 5+, allows you to manipulate HTML in a very easy way. Because it supports invalid HTML, this parser is better than other PHP scripts that use complicated regexes to extract information from web pages.

Before getting the necessary information, a DOM should be created from either a URL or a file. The following script extracts links and images from a website:

    // Create DOM from URL or file
    $html = file_get_html('http://www.microsoft.com/');

    // Extract links
    foreach ($html->find('a') as $element)
        echo $element->href . '<br>';

    // Extract images
    foreach ($html->find('img') as $element)
        echo $element->src . '<br>';

The parser can also be used to modify HTML elements:

    // Create DOM from string
    $html = str_get_html('<div id="simple">Simple</div><div id="parser">Parser</div>');

    // Add a class to the second div (index 1)
    $html->find('div', 1)->class = 'bar';

    // Replace the inner text of the div whose id is "simple"
    $html->find('div[id=simple]', 0)->innertext = 'foo';

    // Output: <div id="simple">foo</div><div id="parser" class="bar">Parser</div>
    echo $html;

Do you wish to retrieve content without any tags?

    echo file_get_html('http://www.yahoo.com/')->plaintext;

In the package files of this parser ([url]) you can find some scraping examples for Digg, IMDb, and Slashdot. Let's create one that extracts the first 10 results (titles only) for the keyword "php" from Google. The selector reflects Google's markup at the time of writing and may no longer match:

    $url = 'http://www.google.com/search?hl=en&q=php&btnG=search';

    // Create DOM from URL
    $html = file_get_html($url);

    // Match all 'a' tags whose class attribute equals 'l'
    foreach ($html->find('a[class=l]') as $key => $info) {
        echo ($key + 1) . '. ' . $info->plaintext . "<br/>\n";
    }

NOTE: Make sure to include the parser before using any of its functions:

    include 'simple_html_dom.php';

For more information on using the parser, consider checking the PHP Simple HTML DOM Parser manual. To download the package files, use the following URL: [url]
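The snippets above assume that parsing always succeeds and that every link carries an href. A minimal sketch of a more defensive variant follows. The helper name extract_links and the sample markup are hypothetical, and the code assumes that simple_html_dom.php sits next to the script and that str_get_html returns false when it cannot build a DOM, as it does in the versions of the library I have seen.

```php
<?php
// Assumption: the parser file is in the same directory as this script.
include 'simple_html_dom.php';

/**
 * Hypothetical helper (not part of the library): collect the href value
 * of every link found in an HTML string.
 */
function extract_links($markup)
{
    $html = str_get_html($markup);

    // str_get_html is assumed to return false for input it cannot parse
    // (for example an empty string), so bail out before calling find().
    if ($html === false) {
        return array();
    }

    $links = array();
    foreach ($html->find('a') as $element) {
        // Anchors without an href do not yield a string here, so skip them.
        if (is_string($element->href)) {
            $links[] = $element->href;
        }
    }

    // Release the DOM tree once we are done with it.
    $html->clear();

    return $links;
}

// Hypothetical sample markup: two real links and one named anchor.
$sample = '<p><a href="/docs">Docs</a> <a name="top">Top</a> <a href="/faq">FAQ</a></p>';
print_r(extract_links($sample));
```

The same check applies to file_get_html, which may also fail when a page cannot be fetched, so testing its return value before calling find() avoids a fatal error on a bad URL.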


