A Tour of Go – Exercise: Web Crawler

In this exercise you'll use Go's concurrency features to parallelize a web crawler.

Modify the Crawl function to fetch URLs in parallel without fetching the same URL twice.
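
The "without fetching the same URL twice" requirement comes down to making the check-and-mark step atomic across goroutines. One possible way to do that, shown here as a minimal sketch rather than the exercise's intended answer, is a map guarded by a sync.Mutex. The names visitedSet, newVisitedSet, TryVisit and countClaims are hypothetical and do not appear in the exercise skeleton.

```go
package main

import (
	"fmt"
	"sync"
)

// visitedSet is a hypothetical helper (not part of the exercise skeleton)
// that records which URLs have already been claimed by some goroutine.
type visitedSet struct {
	mu   sync.Mutex
	seen map[string]bool
}

func newVisitedSet() *visitedSet {
	return &visitedSet{seen: make(map[string]bool)}
}

// TryVisit reports whether the caller is the first to claim url.
// The check and the mark happen under one lock, so only one caller wins.
func (v *visitedSet) TryVisit(url string) bool {
	v.mu.Lock()
	defer v.mu.Unlock()
	if v.seen[url] {
		return false
	}
	v.seen[url] = true
	return true
}

// countClaims starts n goroutines that all try to claim the same URL
// and returns how many of them succeeded.
func countClaims(n int, url string) int {
	v := newVisitedSet()
	var wg sync.WaitGroup
	var mu sync.Mutex
	claims := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if v.TryVisit(url) {
				mu.Lock()
				claims++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return claims
}

func main() {
	fmt.Println(countClaims(100, "http://golang.org/")) // prints 1
}
```

In a crawler, each goroutine would call TryVisit before fetching and skip the URL when it returns false.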

package main

import (
	"fmt"
)

type Fetcher interface {
	// Fetch returns the body of URL and
	// a slice of URLs found on that page.
	Fetch(url string) (body string, urls []string, err error)
}

// Crawl uses fetcher to recursively crawl
// pages starting with url, to a maximum of depth.
// Fetch errors are reported on out. The visited map travels through mutex,
// a channel of capacity 1 that acts as a lock. Crawl signals on end once it
// and every goroutine it started have finished.
func Crawl(url string, depth int, fetcher Fetcher, out chan string, mutex chan map[string]bool, end chan bool) {
	if depth <= 0 {
		end <- true
		return
	}
	body, urls, err := fetcher.Fetch(url)
	if err != nil {
		out <- err.Error()
		end <- true
		return
	}
	fmt.Printf("found: %s %q\n", url, body)

	// Take the visited map, claim every unseen URL, then hand the map back.
	visited := <-mutex
	subEnd := make(chan bool)
	i := 0
	for _, u := range urls {
		if !visited[u] {
			visited[u] = true
			i++
			go Crawl(u, depth-1, fetcher, out, mutex, subEnd)
		}
	}
	mutex <- visited

	// Wait for every child crawl to finish before signalling our own end.
	for ; i > 0; i-- {
		<-subEnd
	}
	end <- true
}

func main() {
	out := make(chan string)
	// Capacity 1 so that handing the map back never blocks,
	// even when no other goroutine is waiting for it.
	mutex := make(chan map[string]bool, 1)
	visited := make(map[string]bool)
	end := make(chan bool)
	visited["http://golang.org/"] = true
	mutex <- visited
	go Crawl("http://golang.org/", 4, fetcher, out, mutex, end)
	for {
		select {
		case t := <-out:
			fmt.Println(t)
		case <-end:
			return
		}
	}
}

// fakeFetcher is Fetcher that returns canned results.
type fakeFetcher map[string]*fakeResult

type fakeResult struct {
	body string
	urls []string
}

func (f *fakeFetcher) Fetch(url string) (string, []string, error) {
	if res, ok := (*f)[url]; ok {
		return res.body, res.urls, nil
	}
	return "", nil, fmt.Errorf("not found: %s", url)
}

// fetcher is a populated fakeFetcher.
var fetcher = &fakeFetcher{
	"http://golang.org/": &fakeResult{
		"The Go Programming Language",
		[]string{
			"http://golang.org/pkg/",
			"http://golang.org/cmd/",
		},
	},
	"http://golang.org/pkg/": &fakeResult{
		"Packages",
		[]string{
			"http://golang.org/",
			"http://golang.org/cmd/",
			"http://golang.org/pkg/fmt/",
			"http://golang.org/pkg/os/",
		},
	},
	"http://golang.org/pkg/fmt/": &fakeResult{
		"Package fmt",
		[]string{
			"http://golang.org/",
			"http://golang.org/pkg/",
		},
	},
	"http://golang.org/pkg/os/": &fakeResult{
		"Package os",
		[]string{
			"http://golang.org/",
			"http://golang.org/pkg/",
		},
	},
}
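
The crawler above shares its visited map by passing it through a channel, so that only the goroutine currently holding the map may touch it. The following is a minimal standalone sketch of that idea, assuming a buffered channel of capacity 1 is used as the lock; addAll and its variable names are hypothetical and exist only for illustration.

```go
package main

import "fmt"

// addAll is a hypothetical helper: it starts one goroutine per key, and each
// goroutine borrows the map from the channel, updates it, and hands it back.
func addAll(keys []string) map[string]int {
	// Capacity 1: returning the map never blocks, even if nobody is waiting.
	lock := make(chan map[string]int, 1)
	lock <- make(map[string]int)

	done := make(chan bool)
	for _, k := range keys {
		go func(k string) {
			m := <-lock // take ownership of the map
			m[k]++
			lock <- m // give it back
			done <- true
		}(k)
	}
	// Wait for one completion signal per goroutine started.
	for range keys {
		<-done
	}
	return <-lock
}

func main() {
	counts := addAll([]string{"a", "b", "a"})
	fmt.Println(counts["a"], counts["b"]) // prints 2 1
}
```

With an unbuffered channel, the last goroutine to return the map would block forever because no receiver remains, which is why the sketch gives the channel a buffer of one.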