How to use PHP instead of JavaScript to work with the DOM, with sample code
Source: Internet
Author: User
The idea is to use DOMDocument to convert an HTML file into a DOM tree, use a DOMXPath instance to search that tree, and then traverse the subtree of each matching node. The background is simple: I needed to organize the data on a navigation page and write it into a database. The intuitive approach is to parse the HTML file, and a common way to do that is to match it with PHP regular expressions. However, that is hard to develop and maintain, and the resulting code is not very readable.
The data on the navigation page is laid out regularly in the DOM tree, so JavaScript could process it easily with a few loops. But JavaScript depends on the browser, which makes it awkward to work with a database. In fact, PHP has a ready-made class library for adding, deleting, modifying, and querying DOM tree nodes. Here are some notes on it.
Two classes are involved: DOMDocument and DOMXPath.
The approach is straightforward: use DOMDocument to convert an HTML file into a DOM tree, use a DOMXPath instance to search the tree for the desired nodes, and then traverse the subtree of each node to get the desired result.
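The add, delete, modify, and query operations mentioned above can be sketched roughly as follows. This is only an illustration: the markup and the helper name editNavList are hypothetical and do not come from the original page.

```php
<?php
// Hypothetical helper: applies modify / delete / add / query steps
// to the first <ul> in the given HTML and returns the item texts.
function editNavList(string $html): array
{
    $dom = new DOMDocument();
    $dom->loadHTML($html);

    $ul = $dom->getElementsByTagName('ul')->item(0);
    $items = $ul->getElementsByTagName('li');

    // Modify: change the text of the first item.
    $items->item(0)->nodeValue = 'Start';

    // Delete: remove the last item.
    $last = $items->item($items->length - 1);
    $ul->removeChild($last);

    // Add: append a new item.
    $ul->appendChild($dom->createElement('li', 'Contact'));

    // Query: collect the text of every remaining item.
    $texts = [];
    foreach ($ul->getElementsByTagName('li') as $li) {
        $texts[] = $li->textContent;
    }
    return $texts;
}

// Made-up sample markup, for illustration only.
$html = '<html><body><ul><li>Home</li><li>News</li><li>Old</li></ul></body></html>';
echo implode(', ', editNavList($html)), "\n"; // prints "Start, News, Contact"
```

All four operations run on the server side, so the result can be written to a database without going through a browser.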
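The three steps (build the tree, search it, walk the subtree) can be sketched on an inline string as follows. The markup, the class name "nav", and the helper name extractLinks are assumptions made up for this illustration. A real page would need its own XPath expression.

```php
<?php
// Hypothetical helper: returns [text, href] pairs for every link
// found inside a <div class="nav"> element.
function extractLinks(string $html): array
{
    // Step 1: convert the HTML into a DOM tree.
    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Step 2: search the tree with XPath.
    $xpath = new DOMXPath($dom);
    $links = [];
    foreach ($xpath->query('//div[@class="nav"]') as $div) {
        // Step 3: walk the subtree of each matching node.
        foreach ($xpath->query('.//a', $div) as $a) {
            $links[] = [trim($a->textContent), $a->getAttribute('href')];
        }
    }
    return $links;
}

// Made-up sample markup, for illustration only.
$html = '<html><body>'
      . '<div class="nav"><ul>'
      . '<li><a href="/a">Alpha</a></li>'
      . '<li><a href="/b"> Beta </a></li>'
      . '</ul></div>'
      . '<div class="footer"><a href="/c">Skip</a></div>'
      . '</body></html>';

foreach (extractLinks($html) as [$text, $href]) {
    echo $text, "\t", $href, "\n";
}
```

Passing the matched node as the second argument of query() keeps the inner search relative to that node, so links outside the navigation block are ignored.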
The current directory contains an HTML file named "./hao.html".
Now we need to extract the Chinese text from all of the tags. The PHP code is as follows:
<?php
// Convert an html/xml file into a DOM tree
$dom = new DOMDocument();
$dom->loadHTMLFile("hao.html");
$xpath = new DOMXPath($dom);
// Other query examples:
// Example 1: every element that has an id
// $elements = $xpath->query("//*[@id]");
// Example 2: the node with a given id
// $elements = $xpath->query("/html/body/p[@id='yourtagidhere']");
// Example 3: same as above, with a wildcard
// $elements = $xpath->query("*/p[@id='yourtagidhere']");
// Get all dl tags whose class is "fix"
$dls = $xpath->query('//dl[@class="fix"]');
foreach ($dls as $dl) {
    // Walk the direct children of each dl and print their text, tab-separated
    $spans = $dl->childNodes;
    foreach ($spans as $span) {
        echo trim($span->textContent) . "\t";
    }
    echo "\n";
}
?>
The output result is as follows:
Note: DOMDocument's default encoding is Latin-1 (ISO-8859-1). Therefore, when processing UTF-8 encoded Chinese characters, an encoding declaration needs to be added to the HTML.
The code is as follows:
An encoding declaration written in another location, or in another form, is not recognized.
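As a minimal sketch of such an encoding declaration, assume it takes the form of a Content-Type meta tag in the document head. This is a commonly used form, but it is an assumption here and not necessarily the author's exact code. The sample markup and the helper name firstParagraphText are also hypothetical.

```php
<?php
// Hypothetical helper: returns the text of the first <p> in the HTML.
function firstParagraphText(string $html): string
{
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    return $dom->getElementsByTagName('p')->item(0)->textContent;
}

// Assumption: the UTF-8 encoding is declared with a Content-Type
// meta tag inside <head>, before any non-ASCII content appears.
$html = '<html><head>'
      . '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
      . '</head><body><p>中文</p></body></html>';

echo firstParagraphText($html), "\n";
```

With the declaration in place, the parser decodes the input as UTF-8 and textContent returns the Chinese text unchanged. Without it, the same bytes may be interpreted as Latin-1 and come out garbled.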