[Python] A small Google Translate program

Source: Internet
Author: User

The project needs a translation function: the text is submitted to Google Translate, and the returned result is read back.

First, work out how Google Translate handles a request:

Request Processing

After submitting a translation, inspect the request and the response:

url = http://translate.google.cn/translate_a/t

The submitted form contains the following fields:

sl = source language = en (English)

tl = target language = zh-CN (Simplified Chinese)

ie, oe = input and output encoding = UTF-8

q = query = "this is a dog"

So we get our POST data:

values = {'client': 't', 'sl': 'en', 'tl': 'zh-CN', 'hl': 'zh-CN',
          'ie': 'UTF-8', 'oe': 'UTF-8', 'prev': 'btn',
          'ssel': '0', 'tsel': '0', 'q': text}
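To see what these fields look like once encoded for submission, the following minimal sketch URL-encodes them. It assumes Python 3, where the function lives in `urllib.parse` (the article's own code targets Python 2, where it is `urllib.urlencode`), and it uses the sample query from above as the value of `text`.

```python
from urllib.parse import urlencode

text = "this is a dog"  # sample query from the article

values = {'client': 't', 'sl': 'en', 'tl': 'zh-CN', 'hl': 'zh-CN',
          'ie': 'UTF-8', 'oe': 'UTF-8', 'prev': 'btn',
          'ssel': '0', 'tsel': '0', 'q': text}

# urlencode joins the key=value pairs with '&' and escapes spaces as '+'
post_data = urlencode(values)
print(post_data)
```

The printed string ends with `q=this+is+a+dog`, which is the form body the request will carry.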

Next, check the request headers. They include a browser User-Agent string; we do not know whether it is required, so we add one as well:

browser = "Mozilla/5.0 (Windows NT 6.1; WOW64)"

Combining the above gives the request:

values = {'client': 't', 'sl': 'en', 'tl': 'zh-CN', 'hl': 'zh-CN',
          'ie': 'UTF-8', 'oe': 'UTF-8', 'prev': 'btn',
          'ssel': '0', 'tsel': '0', 'q': text}
url = "http://translate.google.cn/translate_a/t"
data = urllib.urlencode(values)
req = urllib2.Request(url, data)
browser = "Mozilla/5.0 (Windows NT 6.1; WOW64)"
req.add_header('User-Agent', browser)

Then we fetch the page:

response = urllib2.urlopen(req)
get_page = response.read()
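For readers on Python 3, the request-building step might look like the sketch below. It is an adaptation rather than the article's code: `urllib2.Request` becomes `urllib.request.Request`, the body has to be encoded to bytes, and the names `build_request`, `URL` and `BROWSER` are illustrative. The sketch only builds the request and inspects it; it does not contact the server.

```python
from urllib.parse import urlencode
from urllib.request import Request

URL = "http://translate.google.cn/translate_a/t"
BROWSER = "Mozilla/5.0 (Windows NT 6.1; WOW64)"


def build_request(text):
    """Build (but do not send) the POST request described above."""
    values = {'client': 't', 'sl': 'en', 'tl': 'zh-CN', 'hl': 'zh-CN',
              'ie': 'UTF-8', 'oe': 'UTF-8', 'prev': 'btn',
              'ssel': '0', 'tsel': '0', 'q': text}
    # In Python 3 the request body must be bytes, not str
    data = urlencode(values).encode("utf-8")
    req = Request(URL, data=data)
    req.add_header("User-Agent", BROWSER)
    return req


req = build_request("this is a dog")
print(req.get_method())  # a Request that carries a body defaults to POST
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. Whether the endpoint still answers in the format described in this article is not something the sketch checks.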

Response Processing

Now look at the response. A short query does not show the overall structure, but after submitting longer, multi-sentence text we can see that Google Translate returns the following format (the full response is too long to reproduce here):

[[1st sentence translation, original text, pronunciation], [2nd sentence translation, original text, pronunciation], ...], [other information (meanings and so on)]

Therefore, a two-step regular expression match can extract the text:

text_page = re.search(r'\[\[.*?\]\]', get_page).group()
rex = re.compile(r'\[\".*?\",')
re.findall(rex, text_page)
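The two steps can be tried offline on a made-up response. The `get_page` string below is a hypothetical sample shaped like the format described above, not real Google output, and the sketch assumes Python 3.

```python
import re

# Hypothetical response in the format described above (not real output)
get_page = ('[[["这是一只狗。","This is a dog.","Zhe shi yi zhi gou."],'
            '["它很友好。","It is friendly.","Ta hen youhao."]],'
            '[["other information"]]]')

# Step 1: lazily match from the first "[[" to the first "]]",
# which covers the block of translated sentences.
text_page = re.search(r'\[\[.*?\]\]', get_page).group()

# Step 2: inside that block, match each '["...",' prefix, that is,
# the first quoted field of every sentence entry (the translation).
rex = re.compile(r'\[\".*?\",')
matches = rex.findall(text_page)
# matches == ['["这是一只狗。",', '["它很友好。",']
```

Both patterns use the lazy quantifier `.*?`, so each match stops at the earliest possible closing delimiter instead of running to the end of the response.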

Finally, the matched text still contains extra characters such as [ and ", so we strip them with further replace calls:

item = item.replace('[', "")
item = item.replace('",', "")
item = item.replace('"', "")
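The cleanup can be checked on sample matches, as in the sketch below (Python 3 assumed). The `raw_items` values are hypothetical and the helper name `clean` is illustrative. The second half shows an alternative that is not the article's method: a capture group in the regular expression returns only the text between the quotes, so no replace step is needed.

```python
import re

# Hypothetical raw matches, shaped like the output of the second regex
raw_items = ['["这是一只狗。",', '["它很友好。",']


def clean(item):
    """Strip the leading '[' and the quote/comma residue, as in the article."""
    item = item.replace('[', "")
    item = item.replace('",', "")
    item = item.replace('"', "")
    return item


cleaned = [clean(item) for item in raw_items]
# cleaned == ['这是一只狗。', '它很友好。']

# Alternative (not the article's method): capture only the quoted text
rex = re.compile(r'\["(.*?)",')
captured = rex.findall("".join(raw_items))
# captured == ['这是一只狗。', '它很友好。']
```

One difference worth knowing: the replace calls also remove any `[` or `"` that appears inside the translation itself, while the capture group leaves the inner text untouched.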

Final program

import re
import urllib
import urllib2


def translate(text):
    """translate English to Chinese"""
    # Form fields: en -> zh-CN, UTF-8 in and out, q carries the text
    values = {'client': 't', 'sl': 'en', 'tl': 'zh-CN', 'hl': 'zh-CN',
              'ie': 'UTF-8', 'oe': 'UTF-8', 'prev': 'btn',
              'ssel': '0', 'tsel': '0', 'q': text}
    url = "http://translate.google.cn/translate_a/t"
    data = urllib.urlencode(values)
    req = urllib2.Request(url, data)
    # Send a browser User-Agent with the request
    browser = "Mozilla/5.0 (Windows NT 6.1; WOW64)"
    req.add_header('User-Agent', browser)
    response = urllib2.urlopen(req)
    get_page = response.read()
    # Step 1: isolate the block of translated sentences
    text_page = re.search(r'\[\[.*?\]\]', get_page).group()
    text_list = []
    # Step 2: take the first quoted field of each sentence entry
    rex = re.compile(r'\[\".*?\",')
    for item in re.findall(rex, text_page):
        # Strip the bracket, quote and comma residue
        item = item.replace('[', "")
        item = item.replace('",', "")
        item = item.replace('"', "")
        text_list.append(item)
    text_result = "".join(text_list)
    return text_result
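The program above is Python 2 (`urllib2` does not exist in Python 3). A possible Python 3 port is sketched below under a few assumptions: the response is decoded as UTF-8 because the request asks for `oe=UTF-8`, the parsing is split into a separate helper (the hypothetical `extract_translation`) so it can be exercised without network access, and a missing match returns an empty string instead of raising. The `sample` string is made up to fit the format described in this article.

```python
import re
from urllib.parse import urlencode
from urllib.request import Request, urlopen

URL = "http://translate.google.cn/translate_a/t"
BROWSER = "Mozilla/5.0 (Windows NT 6.1; WOW64)"


def extract_translation(page):
    """Pull the translated sentences out of a response string."""
    outer = re.search(r'\[\[.*?\]\]', page)
    if outer is None:  # response is not in the expected format
        return ""
    text_list = []
    for item in re.findall(r'\[\".*?\",', outer.group()):
        # Same cleanup as the article: drop '[', '",' and '"'
        item = item.replace('[', "").replace('",', "").replace('"', "")
        text_list.append(item)
    return "".join(text_list)


def translate(text):
    """Translate English to Chinese (sends a network request)."""
    values = {'client': 't', 'sl': 'en', 'tl': 'zh-CN', 'hl': 'zh-CN',
              'ie': 'UTF-8', 'oe': 'UTF-8', 'prev': 'btn',
              'ssel': '0', 'tsel': '0', 'q': text}
    data = urlencode(values).encode("utf-8")  # body must be bytes
    req = Request(URL, data=data)
    req.add_header("User-Agent", BROWSER)
    with urlopen(req) as response:
        page = response.read().decode("utf-8")
    return extract_translation(page)


# Offline check of the parsing step with a hypothetical response
sample = ('[[["这是一只狗。","This is a dog.","Zhe shi yi zhi gou."],'
          '["它很友好。","It is friendly.","Ta hen youhao."]],'
          '[["other information"]]]')
result = extract_translation(sample)
# result == '这是一只狗。它很友好。'
```

Keeping the parsing in its own function means a change in the response format only requires updating `extract_translation`, and that function can be tested against saved responses.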
