The article "Python uses XSLT to extract Web page data" uses XSLT to extract content from a target page. Its sample program assigns a long block of XSLT directly to a variable, but the article does not say where that XSLT came from.
Some readers have asked: this XSLT is so long, doesn't it take a long time to write?
In fact, the XSLT is generated automatically by the visual annotation feature of Gooseeker's MS Moushutai (谋数台) tool, and generating it takes about 1 minute.
The steps below use the forum post list from that sample program as an example:
The first step: open Gooseeker's MS Moushutai and enter the URL to crawl;
The second step: in the tool's browser window, directly select the content to extract, give it a name, and click Confirm;
Screenshot: http://www.gooseeker.com/doc/data/attachment/forum/201605/17/104004yistw2es1itgiswx.png
The third step: click the "Test" button on the workbench; the XSLT is generated and displayed in the Data Rules window.
Screenshot: http://www.gooseeker.com/doc/data/attachment/forum/201605/17/104004nytml8yx7lmxym7i.png
With the steps above, the XSLT is generated in about 1 minute, without any programming, by working directly on the page in a graphical interface.
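To show where the generated XSLT ends up, here is a minimal sketch of applying an XSLT string to a page with lxml in Python. The XSLT here is a short hand-written stand-in, not real tool output, and the sample HTML, the class names `threads` and `replies`, the output element names, and the `extract` helper are all assumptions made up for illustration. A real generated rule would be much longer and would be pasted in place of `XSLT_RULE`.

```python
from lxml import etree

# Hypothetical stand-in for the generated XSLT; real tool output is much
# longer and its element names will differ.
XSLT_RULE = """\
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8"/>
  <xsl:template match="/">
    <list>
      <xsl:for-each select="//ul[@class='threads']/li">
        <item>
          <title><xsl:value-of select="normalize-space(a)"/></title>
          <replies><xsl:value-of select="normalize-space(span[@class='replies'])"/></replies>
        </item>
      </xsl:for-each>
    </list>
  </xsl:template>
</xsl:stylesheet>
"""

# Made-up forum list page, used instead of a live download so the sketch
# runs offline.
SAMPLE_HTML = """
<html><body>
<ul class="threads">
  <li><a href="/t/1">First post</a> <span class="replies">3</span></li>
  <li><a href="/t/2">Second post</a> <span class="replies">0</span></li>
</ul>
</body></html>
"""


def extract(html_text, xslt_text):
    """Parse the HTML, compile the XSLT, and return the transformed tree."""
    doc = etree.HTML(html_text)                      # lenient HTML parser
    transform = etree.XSLT(etree.XML(xslt_text))     # compile the stylesheet
    return transform(doc)


result = extract(SAMPLE_HTML, XSLT_RULE)
titles = [t.text for t in result.getroot().iter("title")]
print(titles)
print(str(result))
```

In an actual crawl, the HTML text would come from downloading the target URL rather than from a constant, and only the XSLT string would need to change when the page layout changes.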
This article is from the "Fullerhua blog" blog; reprinting is declined.
Quickly generate XSLT for Web page content extraction in 1 minute