I have been using PHP Simple HTML DOM Parser to parse HTML pages, and it works well. It builds a DOM tree from the page so that you can query its content, which makes it handy for scraping. Here is an example: scraping data with PHP Simple HTML DOM Parser.

The parser, written for PHP 5+, lets you manipulate HTML in a very easy way. Because it supports invalid HTML, it is a better choice than PHP scripts that rely on complicated regexes to extract information from web pages.

Before getting the necessary info, a DOM should be created from either a URL or a file. The following script extracts links and images from a website:

    // Create DOM from URL or file
    $html = file_get_html('http://www.microsoft.com/');

    // Extract links
    foreach ($html->find('a') as $element)
        echo $element->href . '<br>';

    // Extract images
    foreach ($html->find('img') as $element)
        echo $element->src . '<br>';

The parser can also be used to modify HTML elements:

    // Create DOM from string
    $html = str_get_html('<div id="simple">Simple</div><div id="parser">Parser</div>');

    $html->find('div', 1)->class = 'bar';
    $html->find('div[id=simple]', 0)->innertext = 'foo';

    // Output: <div id="simple">foo</div><div id="parser" class="bar">Parser</div>
    echo $html;

Do you wish to retrieve content without any tags?

    echo file_get_html('http://www.yahoo.com/')->plaintext;

In the package files of this parser ([url]) you can find some scraping examples for Digg, IMDb, and Slashdot. Let's create one that extracts the first 10 results (titles only) for the keyword "php" from Google:

    $url = 'http://www.google.com/search?hl=en&q=php&btnG=search';

    // Create DOM from URL
    $html = file_get_html($url);

    // Match all 'a' tags whose class attribute equals 'l'
    foreach ($html->find('a[class=l]') as $key => $info) {
        echo ($key + 1) . '. ' . $info->plaintext . "<br />\n";
    }

NOTE: Make sure to include the parser before using any of its functions:

    include 'simple_html_dom.php';

For more information on using the parser, consider checking the PHP Simple HTML DOM Parser manual. To download the package files, use the following URL: [url]
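
The link and image extraction described above can also be wrapped in a small helper that parses a string and checks whether parsing succeeded before calling find(). This is only a sketch under a few assumptions: the function name extractLinksAndImages and the sample markup are made up for illustration, simple_html_dom.php is assumed to sit in the same directory as the script, and the `!$html` check is there because some releases of the parser appear to return false for empty or oversized input.

```php
<?php
// Assumption: simple_html_dom.php from the parser package is in this directory.
include_once 'simple_html_dom.php';

/**
 * Hypothetical helper (not part of the library): collect the href of every
 * link and the src of every image found in an HTML string.
 */
function extractLinksAndImages($markup)
{
    $result = array('links' => array(), 'images' => array());

    // Build the DOM from a string; bail out if the parser gives nothing back.
    $html = str_get_html($markup);
    if (!$html) {
        return $result;
    }

    // 'a[href]' only matches anchors that actually carry an href attribute.
    foreach ($html->find('a[href]') as $element) {
        $result['links'][] = $element->href;
    }

    // 'img[src]' only matches images that actually carry a src attribute.
    foreach ($html->find('img[src]') as $element) {
        $result['images'][] = $element->src;
    }

    // Release the memory held by the DOM tree.
    $html->clear();

    return $result;
}

// Illustrative input only; the second anchor has no href and is skipped.
$sample = '<p><a href="/docs">Docs</a> <a name="top">Top</a> <img src="logo.png"></p>';
print_r(extractLinksAndImages($sample));
```

Returning arrays instead of echoing inside the loops keeps the extraction separate from the output, so the same helper could be fed with markup fetched some other way.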