Today we'll write a simple program that fetches the content of a web page at a given URL and stores it in a local file. The program touches on network requests, file operations, and a few other topics. Here is the implementation:
// fetch.go
package main

import (
	"fmt"
	"io/ioutil"
	"net/http"
	"os"
)

func main() {
	url := os.Args[1]
	// Fetch the resource at the given URL.
	res, err := http.Get(url)
	if err != nil {
		fmt.Fprintf(os.Stderr, "fetch: %v\n", err)
		os.Exit(1)
	}
	// Read the resource data.
	body, err := ioutil.ReadAll(res.Body)
	// Close the resource stream.
	res.Body.Close()
	if err != nil {
		fmt.Fprintf(os.Stderr, "fetch: reading %s: %v\n", url, err)
		os.Exit(1)
	}
	// Print the content to the console; the two calls below are equivalent.
	fmt.Printf("%s", body)
	fmt.Printf(string(body))
	// Write the content to a local file.
	ioutil.WriteFile("site.txt", body, 0644)
}
In the code above, we imported the net/http package, called http.Get(url) to fetch the resource at the given URL, read out the resource data, printed it to the console, and wrote the content to a local file.
Note that after reading the resource data, the response body should be closed promptly to avoid leaking resources.
In addition, for error handling we used fmt.Fprintf(), one of three related formatting functions:
Printf: formats the string and writes it to os.Stdout.
Fprintf: formats the string and writes it to a specified output, so it takes one more parameter than Printf: an io.Writer, such as a file or os.Stderr.
Sprintf: formats the string and returns the result as a string instead of writing it anywhere.
Compile and run the program with a URL argument; for now we'll use Baidu, though hopefully Google will be reachable again someday:
$ ./fetch http://www.baidu.com
After the program runs, a site.txt file is generated in the current directory.
Golang Series article: Crawling Web content