Exploration of Baidu and Google URL encoding methods

Source: Internet
Author: User

Today I noticed that Baidu and Google encode search URLs differently.

For example, search for the Chinese word "技术" ("technology") on each site and watch the IE address bar.

Searching on Baidu gives:
http://www.baidu.com/s?wd=%BC%BC%CA%F5&cl=3

And Google?
http://www.google.com/search?hl=zh-CN&q=%E6%8A%80%E6%9C%AF&lr=

That is, baidu_urlencode("技术") = %BC%BC%CA%F5, while google_urlencode("技术") = %E6%8A%80%E6%9C%AF.
So which encoding does each of them use?
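For reference, the two encoded forms can be reproduced from the word itself. The following is a minimal Python 3 sketch (the transcripts in this article use Python 2); it assumes the search word is 技术, and the variable names are purely illustrative.

```python
from urllib.parse import quote

word = "技术"  # the search term, "technology"

# Percent-encode the bytes of the word under each candidate encoding.
gb2312_form = quote(word, encoding="gb2312")
utf8_form = quote(word, encoding="utf-8")

print(gb2312_form)
print(utf8_form)
```

Each Chinese character takes two bytes in GB2312 and three in UTF-8, which is why the two query strings differ in length.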

Let's bring in our lovable Python (Python 2 here) to help solve the problem.
>>> import urllib
>>> url = urllib.unquote('http://www.baidu.com/s?wd=%BC%BC%CA%F5&cl=3')
>>> url
'http://www.baidu.com/s?wd=\xbc\xbc\xca\xf5&cl=3'
>>> print url.decode('gb2312')
http://www.baidu.com/s?wd=技术&cl=3
>>>
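The same check can be sketched in Python 3, where the percent-decoding helpers live in urllib.parse. This is only an illustration: it assumes the Baidu URL carries the GB2312 bytes of 技术 in its wd parameter, and it lets parse_qs do both the percent-decoding and the byte-to-text conversion.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.baidu.com/s?wd=%BC%BC%CA%F5&cl=3"

# parse_qs percent-decodes each value; `encoding` tells it how to
# interpret the resulting bytes (the default would be UTF-8).
params = parse_qs(urlsplit(url).query, encoding="gb2312")

print(params["wd"][0])
```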

So Baidu encodes its URLs in GB2312. What about Google? Let's try the same recipe.

>>> url2 = urllib.unquote('http://www.google.com/search?hl=zh-CN&q=%E6%8A%80%E6%9C%AF&lr=')
>>> url2
'http://www.google.com/search?hl=zh-CN&q=\xe6\x8a\x80\xe6\x9c\xaf&lr='
>>> print url2.decode('gb2312')
Traceback (most recent call last):
  File "<input>", line 1, in ?
UnicodeDecodeError: 'gb2312' codec can't decode bytes in position 40-41: illegal multibyte sequence
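A Python 3 sketch of the same failure is shown below. It assumes only the value of Google's q parameter is of interest, so it decodes that value alone; the names raw and failed are illustrative.

```python
from urllib.parse import unquote_to_bytes

# Google's q= value, percent-decoded into raw bytes.
raw = unquote_to_bytes("%E6%8A%80%E6%9C%AF")

try:
    raw.decode("gb2312")  # strict decoding: invalid bytes raise
    failed = False
except UnicodeDecodeError as exc:
    failed = True
    print("gb2312 rejected the bytes:", exc.reason)
```

Decoding the raw bytes strictly is a deliberate choice here: urllib.parse.unquote defaults to errors='replace', so it would substitute replacement characters instead of raising and the mismatch would be easy to miss.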

Unfortunately, this raises an error: the bytes are not valid GB2312. Let's try something else, perhaps 'utf-8':

>>> print url2.decode('utf-8')
http://www.google.com/search?hl=zh-CN&q=技术&lr=
That works. So Google encodes its URLs in UTF-8.
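The trial-and-error approach used above can be wrapped in a small helper. This is a hypothetical Python 3 sketch, not something either search engine provides: guess_encoding and its candidates parameter are names invented for illustration, and the function simply returns the first candidate that decodes the bytes without error.

```python
from urllib.parse import unquote_to_bytes

def guess_encoding(quoted, candidates=("utf-8", "gb2312")):
    """Return the first candidate that strictly decodes the bytes, or None."""
    raw = unquote_to_bytes(quoted)
    for name in candidates:
        try:
            raw.decode(name)
            return name
        except UnicodeDecodeError:
            continue
    return None

print(guess_encoding("%BC%BC%CA%F5"))        # Baidu's wd= value
print(guess_encoding("%E6%8A%80%E6%9C%AF"))  # Google's q= value
```

UTF-8 is tried first because it is the stricter of the two. Some byte sequences decode without error under both encodings, so the result is a heuristic rather than a proof.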

 

Python is so cool!

 
