Python 3 Web Crawler (II): Using urllib.request.urlopen to Send Data to Youdao Translate and Get the Translation Result

Source: Internet
Author: User
Tags urlencode

I. The url parameter of urlopen

The url argument does not have to be a string such as http://www.baidu.com. It can also be a Request object: we first construct a Request object and then pass it to urlopen() as the argument, as follows:

    # -*- coding: UTF-8 -*-
    from urllib import request

    if __name__ == "__main__":
        req = request.Request("http://fanyi.baidu.com/")
        response = request.urlopen(req)
        html = response.read()
        html = html.decode("utf-8")
        print(html)

Running this code also retrieves the page content. Compare it with the code in the previous note and the difference is clear: urlopen() now receives a Request object instead of a URL string.
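To see what a Request object carries before it is handed to urlopen(), it can be inspected without sending anything over the network. The following is a minimal sketch using the same URL as above; it only reads attributes that urllib.request.Request fills in when it is constructed.

```python
from urllib import request

# Build the Request object only; no network traffic happens here.
req = request.Request("http://fanyi.baidu.com/")

full_url = req.full_url      # the URL exactly as passed in
host = req.host              # host name parsed out of the URL
method = req.get_method()    # "GET" because no data was attached

print(full_url)
print(host)
print(method)
```

Because the request is an object, further details such as data or headers can be attached to it before it is passed to urlopen().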

The object returned by urlopen() can be read with read(), and it also provides the geturl(), info(), and getcode() methods.

    • geturl() returns the URL of the resource that was retrieved, as a string;

    • info() returns meta-information about the response in the form of HTTP headers, including some information about the server;

    • getcode() returns the HTTP status code; 200 indicates that the request succeeded.

Response headers and HTTP status codes are described in detail on Baidu Baike if you want to read more.
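As a small illustration of what the number returned by getcode() means, the standard library's http.HTTPStatus enumeration maps a status code to its standard phrase. This is only a sketch: the helper name describe_status is made up for this example and is not part of urllib.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a status code together with its standard reason phrase."""
    status = HTTPStatus(code)  # raises ValueError for unknown codes
    return "%d %s" % (status.value, status.phrase)


ok_text = describe_status(200)
missing_text = describe_status(404)
print(ok_text)
print(missing_text)
```

The same helper could be applied to the value of response.getcode() after a real request.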

With that in mind, we can run another test. Create a new file named urllib_test04.py and write the following code:

    # -*- coding: UTF-8 -*-
    from urllib import request

    if __name__ == "__main__":
        req = request.Request("http://fanyi.baidu.com/")
        response = request.urlopen(req)
        print("geturl printing information: %s" % (response.geturl()))
        print('**********************************************')
        print("info printing information: %s" % (response.info()))
        print('**********************************************')
        print("getcode printing information: %s" % (response.getcode()))

Running it prints the URL, the response headers, and the HTTP status code, separated by lines of asterisks.

II. The data parameter of urlopen

We can send data to the server by using the data parameter. According to the HTTP specification, GET is used to retrieve information and POST is used to submit data to the server. In other words:

Use POST to submit data from the client to the server;

Use GET to fetch data from the server to the client (GET can also submit data, but we will not consider that for now).

If the data parameter of urlopen() is not set, the HTTP request uses the GET method, that is, we fetch information from the server. If the data parameter is set, the HTTP request uses the POST method, that is, we send data to the server.

The data parameter has its own format: it must be encoded as application/x-www-form-urlencoded. We do not need to know the details of that format, because urllib.parse.urlencode() converts a dictionary of key/value pairs into it automatically. The resulting string must then be encoded to bytes before it is passed as data.
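The effect of the data parameter can be checked without contacting any server. The sketch below encodes a small dictionary with urlencode() and then compares two Request objects, one without data and one with it. The URL http://example.com/ and the two form fields are placeholders chosen for illustration only.

```python
from urllib import parse, request

# Placeholder form fields, used only to show the encoding.
form = {"i": "Jack", "doctype": "json"}

encoded = parse.urlencode(form)   # key=value pairs joined by "&"
data = encoded.encode("utf-8")    # urlopen() expects bytes, not str

# Constructing a Request does not send anything over the network.
get_req = request.Request("http://example.com/")
post_req = request.Request("http://example.com/", data=data)

get_method = get_req.get_method()     # no data, so GET
post_method = post_req.get_method()   # data present, so POST

print(encoded)
print(get_method, post_method)
```

Passing the same bytes as the second argument of urlopen() has the same effect: the request is sent as a POST.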

III. Example: sending data

We send data to Youdao Translate and get the translation result.

1. Open the Youdao Translate page.

2. Right-click on the page and choose Inspect (Inspect Element).

3. Select the Network tab in the panel that appears on the right.

4. Enter the text to translate in the box on the left; here we enter Jack.

5. Click the automatic translation button and watch the requests that appear on the right.

6. Click the new request entry (marked with a red box in the original screenshot) to view its details.

7. Note down this information, in particular the Request URL and the Form Data. We will need it to write the program in a moment.

Create a new file named translate_test.py and write the following code:

    # -*- coding: UTF-8 -*-
    from urllib import request
    from urllib import parse
    import json

    if __name__ == "__main__":
        # The Request URL noted in the browser
        request_url = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule&smartresult=ugc&sessionFrom=https://www.baidu.com/link'
        # Build the form_data dictionary that holds the Form Data
        form_data = {}
        form_data['type'] = 'AUTO'
        form_data['i'] = 'Jack'
        form_data['doctype'] = 'json'
        form_data['xmlVersion'] = '1.8'
        form_data['keyfrom'] = 'fanyi.web'
        form_data['ue'] = 'ue:UTF-8'
        form_data['action'] = 'FY_BY_CLICKBUTTON'
        # Convert to the standard format with urlencode, then encode to bytes
        data = parse.urlencode(form_data).encode('utf-8')
        # Pass the request URL and the encoded data; this sends a POST
        response = request.urlopen(request_url, data)
        # Read the response and decode it
        html = response.read().decode('utf-8')
        # Parse the JSON text
        translate_results = json.loads(html)
        # Pick out the translation result
        translate_results = translate_results['translateResult'][0][0]['tgt']
        # Print the translation
        print("The result of the translation is: %s" % translate_results)

Running the program prints the translation result.

JSON is a lightweight data-interchange format. The content we fetched is JSON-formatted data that contains the translation result we want, so we parse it with json.loads() and then pick out the translated text for Jack.
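The parsing step can be tried on its own, without calling the service. The sketch below uses a hand-written JSON string whose shape is assumed from the indexing in the program above (translateResult, then two list levels, then tgt). The other fields and the values are illustrative placeholders, not a captured response.

```python
import json

# Hypothetical response text, shaped to match the indexing used above.
sample = (
    '{"type": "EN2ZH_CN", "errorCode": 0, '
    '"translateResult": [[{"src": "Jack", "tgt": "杰克"}]]}'
)

parsed = json.loads(sample)               # JSON text -> Python dict
first_sentence = parsed["translateResult"][0][0]
translated = first_sentence["tgt"]        # the translated text

print("The result of the translation is: %s" % translated)
```

If the real response has a different layout, only the indexing line needs to change.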
