Golang Series article: Crawling Web content

Source: Internet
Author: User

Today we will write a simple program that fetches the content of a web page from a specified URL and stores it in a local file. The program touches on network requests and file operations. The implementation is as follows:

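    // fetch.go
    package main

    import (
        "fmt"
        "io/ioutil"
        "net/http"
        "os"
    )

    func main() {
        // The URL is expected as the first command-line argument.
        if len(os.Args) < 2 {
            fmt.Fprintln(os.Stderr, "usage: fetch URL")
            os.Exit(1)
        }
        url := os.Args[1]

        // Fetch the resource at the given URL.
        res, err := http.Get(url)
        if err != nil {
            fmt.Fprintf(os.Stderr, "fetch: %v\n", err)
            os.Exit(1)
        }

        // Read the response data; body is a []byte.
        body, err := ioutil.ReadAll(res.Body)
        // Close the response stream.
        res.Body.Close()
        if err != nil {
            fmt.Fprintf(os.Stderr, "fetch: reading %s: %v\n", url, err)
            os.Exit(1)
        }

        // Print the content to the console. fmt.Print(string(body)) is
        // equivalent; never pass body itself as the format string, because
        // any % characters in the page would be treated as formatting verbs.
        fmt.Printf("%s", body)

        // Write the content to a file.
        if err := ioutil.WriteFile("site.txt", body, 0644); err != nil {
            fmt.Fprintf(os.Stderr, "fetch: writing site.txt: %v\n", err)
            os.Exit(1)
        }
    }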

In the code above, we import the net/http package and call http.Get(url) to fetch the resource at the given URL. We then read the response data, print it to the console, and write it to a local file.

Note that once the response data has been read, the response body should be closed promptly so that the underlying network connection and its resources are not leaked.

In addition, when handling errors we used fmt.Fprintf(), which is one of three closely related formatting functions:

Printf: Formats the string and writes it to os.Stdout.
Fprintf: Formats the string and writes it to the specified io.Writer (for example os.Stderr or an open file), so it takes one more parameter than Printf: the destination writer.
Sprintf: Formats the string and returns the result as a new string instead of writing it anywhere.
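The three functions can be compared side by side in a small sketch. The describe helper and the sample values ("site.txt", 1024) are made up for illustration and do not come from the original program.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
)

// describe is a hypothetical helper showing Sprintf: it returns the
// formatted text as a string instead of writing it anywhere.
func describe(name string, size int) string {
	return fmt.Sprintf("%s: %d bytes", name, size)
}

func main() {
	// Printf writes to os.Stdout.
	fmt.Printf("%s: %d bytes\n", "site.txt", 1024)

	// Fprintf writes to any io.Writer, here os.Stderr.
	fmt.Fprintf(os.Stderr, "fetch: %v\n", "example error")

	// A bytes.Buffer is also an io.Writer, so Fprintf can fill it.
	var buf bytes.Buffer
	fmt.Fprintf(&buf, "%s: %d bytes", "site.txt", 1024)

	// Sprintf produces the same text as a string value.
	s := describe("site.txt", 1024)
	fmt.Println(s == buf.String())
}
```

Writing error messages to os.Stderr, as the fetch program does, keeps them out of the page content when standard output is redirected to a file.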

After compiling, run the program with a URL argument. Baidu is used here for now, in the hope that Google will be reachable again in the near future:

$ ./fetch http://www.baidu.com

After the program runs, a file named site.txt is generated in the current directory.
