Golang: A Simple Crawler Example with goquery

Source: Internet
Author: User

The methods on goquery's Selection type are the core of page parsing. They fall into the following groups.

1) Position-based methods

- Eq(index int) *Selection // reduces the set to the element at the given index

- First() *Selection // reduces the set to its first element

- Last() *Selection // reduces the set to its last element

- Next() *Selection // gets the next sibling of each element

- NextAll() *Selection // gets all following siblings of each element

- Prev() *Selection // gets the previous sibling of each element

- Get(index int) *html.Node // gets the underlying node at the given index

- Index() int // returns the position of the first element in the Selection relative to its siblings

- Slice(start, end int) *Selection // gets the elements from index start up to, but not including, end


2) Expanding the Selection (adding nodes to the selected set)

- Add(selector string) *Selection // adds the nodes matching the selector to the current set

- AndSelf() *Selection // adds the previous set of elements on the stack to the current set (newer versions name this AddBack())

- Union(sel *Selection) *Selection // an alias for AddSelection()


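3) Filtering methods (reducing the selected set)

- End() *Selection // ends the most recent filtering operation and returns the previous selection

- Filter...() // keeps only the elements that match

- Has...() // keeps only the elements that have a matching descendant

- Intersection() // an alias for FilterSelection()

- Not...() // removes the elements that match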


4) Looping over the selected nodes

- Each(f func(int, *Selection)) *Selection // visits every element

- EachWithBreak(f func(int, *Selection) bool) *Selection // visits elements until f returns false

- Map(f func(int, *Selection) string) (result []string) // returns one string per element


5) Modifying the document

- After...() // inserts content after each matched element

- Append...() // appends content to the end of each matched element

- Before...() // inserts content before each matched element

- Clone() // creates a deep copy of the matched nodes

- Empty() // removes all child nodes of the matched elements

- Prepend...()

- Remove...()

- ReplaceWith...()

- Unwrap()

- Wrap...()

- WrapAll...()

- WrapInner...()


6) Reading or changing node attributes and content

- Attr(), RemoveAttr(), SetAttr() // get, remove, or set the value of an attribute

- AddClass(), HasClass(), RemoveClass(), ToggleClass()

- Html() // gets the inner HTML of the first element

- Length() // returns the number of elements in the Selection

- Size() // an alias for Length()

- Text() // gets the combined text of the elements, including their descendants


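7) Testing the identity of nodes

- Contains() // reports whether the given node lies inside one of the selected elements

- Is...() // reports whether at least one selected element matches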


8) Moving around the document tree (the usual ways to find nodes)

- Children...()

- Contents()

- Find...()

- Next...()

- Parent[s]...()

- Prev...()

- Siblings...()


Example:

package main

import (
	"log"
	"net/http"
	"strconv"
	"time"

	"github.com/PuerkitoBio/goquery"
	// The author's own "models" package must also be imported; its path is not given here.
)

func main() {
	client := http.Client{}
	req, _ := http.NewRequest("GET", "http://www.xicidaili.com/wn/1", nil)
	// Set a browser-like User-Agent header on the request.
	req.Header.Add("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.79 Safari/537.36 Maxthon/5.2.3.1000")
	resp, err := client.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	doc, err := goquery.NewDocumentFromReader(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	log.Print(doc.Html())
	// Each table row describes one proxy; cells are picked out by column index.
	doc.Find("tbody tr").Each(func(i int, selection *goquery.Selection) {
		proxy := models.TbSpiderProxyIp{}
		selection.Children().Each(func(i int, selection *goquery.Selection) {
			switch i {
			case 1:
				proxy.Ip = selection.Text()
			case 2:
				port, _ := strconv.ParseInt(selection.Text(), 10, 64)
				proxy.Port = port
			case 3:
				proxy.Address = selection.Text()
			case 9:
				proxy.Check_date = selection.Text()
			default:
			}
		})
		proxy.Https = 1
		proxy.Status = 1
		proxy.CreateDate = time.Now().Format("2006-01-02 15:04:05")
		// Store the record through the models package.
		models.InsertTbSpiderProxy(&proxy)
	})
}
