URL-encoding Chinese characters in Python
This example shows how to convert Chinese characters to and from their URL-encoded (percent-encoded) form in Python. The details are as follows:
Today I was working with Baidu Tieba (Post Bar) and wanted to keep a list of keywords that could be appended to a URL whenever needed. However, when a Chinese keyword such as '丽江' (Lijiang) is placed in a URL, it appears in encoded form as '%E4%B8%BD%E6%B1%9F', so a conversion is required. Here we use the urllib module.
>>> import urllib
>>> data = '丽江'
>>> print data
丽江
>>> data
'\xe4\xb8\xbd\xe6\xb1\x9f'
>>> urllib.quote(data)
'%E4%B8%BD%E6%B1%9F'
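The session above uses Python 2 syntax. As a minimal sketch for readers on Python 3 (an assumption; the original only covers Python 2), the same function is available as urllib.parse.quote, which accepts a str and encodes it as UTF-8 by default. The variable names here are illustrative only.

```python
from urllib.parse import quote

data = "丽江"  # Lijiang

# The UTF-8 bytes that Python 2 displayed as '\xe4\xb8\xbd\xe6\xb1\x9f'
utf8_bytes = data.encode("utf-8")

# quote() percent-encodes each byte; the default encoding is UTF-8
encoded = quote(data)

print(utf8_bytes)
print(encoded)  # %E4%B8%BD%E6%B1%9F
```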
What if we want to convert it back?
>>> urllib.unquote('%E4%B8%BD%E6%B1%9F')
'\xe4\xb8\xbd\xe6\xb1\x9f'
>>> print urllib.unquote('%E4%B8%BD%E6%B1%9F')
丽江
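For comparison, a Python 3 sketch of the reverse direction, assuming the input was encoded as UTF-8. Here urllib.parse.unquote returns a decoded str, while urllib.parse.unquote_to_bytes returns the raw bytes, which corresponds to what Python 2's urllib.unquote showed above.

```python
from urllib.parse import unquote, unquote_to_bytes

encoded = "%E4%B8%BD%E6%B1%9F"

# Raw bytes, comparable to the Python 2 result '\xe4\xb8\xbd\xe6\xb1\x9f'
raw = unquote_to_bytes(encoded)

# Decoded text; unquote() assumes UTF-8 unless told otherwise
text = unquote(encoded)

print(raw)
print(text)  # 丽江
```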
Careful readers will notice that the Tieba URL actually contains %C0%F6%BD%AD rather than '%E4%B8%BD%E6%B1%9F'. This is an encoding issue: Baidu uses GBK, whereas most other sites, such as Google, use UTF-8. The string therefore has to be converted to the target encoding before quoting, as in the following statements.
>>> import sys, urllib
>>> s = '丽江'
>>> urllib.quote(s.decode(sys.stdin.encoding).encode('gbk'))
'%C0%F6%BD%AD'
>>> urllib.quote(s.decode(sys.stdin.encoding).encode('utf8'))
'%E4%B8%BD%E6%B1%9F'
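In Python 3 the decode/encode step is not needed, because str is already Unicode; a rough equivalent (a sketch, not part of the original session) passes the target encoding directly to urllib.parse.quote. The same encoding must be given to unquote to get the text back.

```python
from urllib.parse import quote, unquote

keyword = "丽江"

# Percent-encode using GBK (as Baidu expects) and UTF-8 (as most sites expect)
gbk_encoded = quote(keyword, encoding="gbk")
utf8_encoded = quote(keyword, encoding="utf-8")

# Decoding must use the same encoding that produced the bytes
round_trip = unquote(gbk_encoded, encoding="gbk")

print(gbk_encoded)   # %C0%F6%BD%AD
print(utf8_encoded)  # %E4%B8%BD%E6%B1%9F
print(round_trip)    # 丽江
```

Decoding the GBK form with the default UTF-8 setting does not raise an error, because unquote replaces undecodable bytes by default, so a mismatched encoding shows up as garbled text rather than an exception.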