Use phpQuery to collect web pages
phpQuery is an open-source, server-side PHP project that lets PHP developers process DOM documents easily, for example to grab the headlines from a news website. More interestingly, it borrows the ideas of jQuery: you can work with page content the same way you would in jQuery to get the information you want.
Collect headlines
Let's start with an example. Suppose I want to collect the news headlines from Sina. The code is as follows:
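// Load the phpQuery core file
include 'phpquery/phpQuery.php';
// Fetch and parse the target page
phpQuery::newDocumentFile('http://www.jb51.net');
// Output the inner HTML of the first h1 inside the .blkTop element
echo pq(".blkTop h1:eq(0)")->html();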
These three simple lines are enough to get the headline. First, the program includes the phpQuery.php core file, then it calls newDocumentFile() to read the target page, and finally it outputs the content under the corresponding tag.
pq() is a powerful method that works just like jQuery's $(). Nearly all jQuery selectors can be used in phpQuery; you only need to replace the "." used for method calls with "->". In the example above, pq(".blkTop h1:eq(0)") selects the h1 tags inside the DIV element whose class attribute is blkTop and keeps the first one. The html() method then returns the content of that h1 tag (including HTML tags), which is the headline we want. If you use the text() method instead, only the text of the headline is returned. Of course, the key to using phpQuery well is locating the right content node in the document.
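To make the selector concrete, here is a minimal sketch of what ".blkTop h1:eq(0)" picks out and how html() and text() differ. It deliberately uses PHP's built-in DOM extension (DOMDocument and DOMXPath) rather than phpQuery, so it runs without the library. The sample markup and variable names are made up for illustration.

```php
<?php
// Hypothetical sample markup standing in for a news front page.
$html = '<div class="blkTop"><h1><a href="/a.html">First headline</a></h1>'
      . '<h1>Second headline</h1></div>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Rough XPath equivalent of ".blkTop h1": every h1 inside an element
// whose class list contains "blkTop".
$query = '//*[contains(concat(" ", normalize-space(@class), " "), " blkTop ")]//h1';
$first = $xpath->query($query)->item(0); // ":eq(0)" keeps the first match

// Comparable to html(): the inner markup of the node, tags included.
$innerHtml = '';
foreach ($first->childNodes as $child) {
    $innerHtml .= $doc->saveHTML($child);
}

// Comparable to text(): only the text content.
$text = $first->textContent;

echo $innerHtml, "\n"; // <a href="/a.html">First headline</a>
echo $text, "\n";      // First headline
```

The phpQuery version expresses the same lookup in a single selector string, which is the convenience the article is describing.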
Collect an article list
Next, let's look at an example that fetches the blog article list from the helloweba.com website. The code is as follows:
include 'phpquery/phpQuery.php';
phpQuery::newDocumentFile('http://www.jb51.net');
// Select every element with the class blog_li
$artlist = pq(".blog_li");
foreach ($artlist as $li) {
    // Print the title of each article on its own line
    echo pq($li)->find('h2')->html() . "<br/>";
}
Looping over the DIVs in the list, we find each article title and output it. It's that simple.
Parse XML documents
Suppose there is a test.xml document like this:
<root>
    <contact>
        <name>Zhang San</name>
        <age>22</age>
    </contact>
    <contact>
        <name>Wang Wu</name>
        <age>18</age>
    </contact>
</root>
Now I want to get the age of the contact named Zhang San. The code is as follows:
include 'phpquery/phpQuery.php';
phpQuery::newDocumentFile('test.xml');
// Select the first age element that is a direct child of a contact element
echo pq('contact > age:eq(0)');
Result output: 22
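The selector above relies on Zhang San being the first contact in the file. A lookup by name is sketched below. It uses PHP's built-in DOM extension (DOMDocument and DOMXPath), not phpQuery, so it runs on its own. The XML layout with root and name elements is an assumption inferred from the selector, and the helper name ageOf is hypothetical.

```php
<?php
// Assumed layout of test.xml; element names other than contact and age are guesses.
$xml = <<<XML
<root>
  <contact><name>Zhang San</name><age>22</age></contact>
  <contact><name>Wang Wu</name><age>18</age></contact>
</root>
XML;

// Hypothetical helper: returns the age of the named contact, or null if absent.
function ageOf(string $xml, string $name): ?string
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $xpath = new DOMXPath($doc);

    foreach ($xpath->query('//contact') as $contact) {
        // Look up the name child relative to the current contact element.
        $nameNode = $xpath->query('name', $contact)->item(0);
        if ($nameNode !== null && trim($nameNode->textContent) === $name) {
            $ageNode = $xpath->query('age', $contact)->item(0);
            return $ageNode !== null ? trim($ageNode->textContent) : null;
        }
    }
    return null;
}

echo ageOf($xml, 'Zhang San'), "\n"; // 22
```

Matching on the name keeps the result correct even if the order of the contacts in the file changes.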
As with jQuery, it is that easy to pinpoint a document node, output its content, and parse an XML document. You no longer need tedious code such as regular expressions and string replacement to collect website content. With phpQuery, everything becomes much easier.
phpQuery project official site: http://code.google.com/p/phpquery/