Python character encodings: fixing garbled Chinese file names when downloading with an FTP client on Windows

Last Update:2016-07-19 Source: Internet

Author: User

Chinese text encoding has long been a major pain point in Python 2, where encoding conversion exceptions are thrown all too often. So what exactly are str and unicode in Python? "Unicode" in Python generally means a unicode object; for example, the unicode object for '哈哈' ("haha") is u'\u54c8\u54c8'. A str, by contrast, is a byte array: it holds the result of encoding a unicode object in some storage format (UTF-8, GBK, cp936, GB2312, and so on). On its own it is just a stream of bytes with no further meaning. To display that byte stream as meaningful text, it must be decoded with the correct encoding.
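As a rough illustration of the text/bytes distinction described above, here is a small sketch written for Python 3, where the roles are named differently: str is the Unicode text type (Python 2's unicode) and bytes is the byte array (Python 2's str). The variable names are illustrative only.

```python
# Python 3: str is Unicode text, bytes is a raw byte array.
text = "\u54c8\u54c8"  # the two characters of '哈哈' ("haha")

utf8_bytes = text.encode("utf-8")  # one possible storage format
gbk_bytes = text.encode("gbk")     # another storage format for the same text

# Same text, different byte arrays of different lengths.
print(len(text), len(utf8_bytes), len(gbk_bytes))  # prints: 2 6 4

# Decoding each byte array with its own encoding recovers the same text.
same_text = utf8_bytes.decode("utf-8") == gbk_bytes.decode("gbk") == text
print(same_text)  # prints: True
```

The byte arrays differ, yet both stand for the same two characters; only the encoding used to decode them gives the bytes a meaning.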

For example:

>>> a = u"你好"
>>> a_utf8 = a.encode("utf-8")
>>> print a_utf8
浣犲ソ
>>> a_gbk = a.encode("gbk")
>>> print a_gbk
你好
>>> a_utf8
'\xe4\xbd\xa0\xe5\xa5\xbd'
>>> a_gbk
'\xc4\xe3\xba\xc3'

Here the unicode object u'你好' ("hello") is encoded as UTF-8. a_utf8 is a byte array holding '\xe4\xbd\xa0\xe5\xa5\xbd', but it is only a byte array, so the print statement cannot be relied on to display 你好. print hands the bytes to the operating system, which interprets the incoming byte stream according to the system encoding. That explains why the UTF-8 encoded string "你好" is displayed as "浣犲ソ": that is how '\xe4\xbd\xa0\xe5\xa5\xbd' renders when interpreted under the GB2312/GBK encoding of a Chinese Windows console. A str records only a byte array, one particular encoded storage form; what it looks like when written to a file or printed depends entirely on the encoding used to decode it. One more note on print: when a unicode object is passed to print, it is converted internally to the local default encoding before being output (at least, that appears to be what happens).
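The console behaviour described above can be imitated without a console. The following is a minimal sketch in Python 3 (where bytes plays the role of Python 2's str), assuming the display side decodes with GBK; the variable names are made up for illustration.

```python
text = "你好"  # "hello"

utf8_bytes = text.encode("utf-8")
gbk_bytes = text.encode("gbk")

# A GBK console that receives UTF-8 bytes effectively does this decode.
# The six UTF-8 bytes are read as three two-byte GBK characters.
mojibake = utf8_bytes.decode("gbk")

# The bytes were never damaged, only misread, so the step can be undone.
recovered = mojibake.encode("gbk").decode("utf-8")

print(ascii(utf8_bytes), ascii(gbk_bytes))
print(mojibake == text, recovered == text)  # prints: False True
```

Decoding gbk_bytes with GBK, by contrast, gives back the original text, which is why printing the GBK-encoded string looks right on a GBK console.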

Decode and encode

Inside Python, strings are represented as Unicode. When converting between encodings, Unicode therefore usually serves as the intermediate form: first decode the string from its current encoding into Unicode, then encode from Unicode into the target encoding. For example, str1.decode('gb2312') converts the GB2312-encoded string str1 to Unicode.

str2.encode('gb2312') converts the Unicode string str2 to GB2312.

Before transcoding, you must know which encoding the str is in; only then can you decode it into Unicode and encode it into another encoding. A string literal in a source file saved as UTF-8 is UTF-8 encoded; in a file saved as GBK it is GBK encoded. In either case, first convert it to Unicode with the decode method, then use the encode method to convert it to the other encoding. If you do not specify an encoding when creating a source file, it is typically saved in the system default encoding.
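The decode-then-encode route can be sketched as a tiny helper. This is written for Python 3, where the byte array type is bytes; transcode is a hypothetical name, not something from the original article or the standard library.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert bytes from one encoding to another via Unicode."""
    text = data.decode(source)   # bytes in the source encoding -> Unicode
    return text.encode(target)   # Unicode -> bytes in the target encoding


gbk_bytes = b"\xc4\xe3\xba\xc3"  # "你好" encoded as GBK
utf8_bytes = transcode(gbk_bytes, "gbk", "utf-8")
print(ascii(utf8_bytes))  # prints: b'\xe4\xbd\xa0\xe5\xa5\xbd'
```

The helper only works if source really is the encoding the bytes were written in; passing the wrong one either raises UnicodeDecodeError or silently yields garbled text.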

If a string is already a unicode object, calling decode on it will raise an error for non-ASCII text, so it is common to check first whether it is a unicode object:
isinstance(s, unicode)  # checks whether s is a unicode object
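A guard of this kind might look as follows in Python 3, where the check is against str rather than unicode. The helper name to_text and its default encoding are assumptions made for this sketch.

```python
def to_text(value, encoding="utf-8"):
    """Return value as Unicode text, decoding only when it is bytes."""
    if isinstance(value, str):
        # Already Unicode text: decoding again would be an error.
        return value
    return value.decode(encoding)


print(to_text("abc") == to_text(b"abc"))  # prints: True
```

With this in place, callers can pass either text or encoded bytes, for example to_text(name_bytes, "gbk") for a GBK-encoded file name.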

Example: fixing a Chinese file name error when downloading with a Python FTP client on Windows

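import logging
import os
import posixpath

# ftpconnect() and creatdir() are the author's own helpers (not shown):
# presumably a logged-in ftplib.FTP object and a local download directory.
def downloadfile(remotedir, zname):
    # zname is the file name as a unicode object.
    # The FTP server expects UTF-8 bytes; the Windows file system uses GBK.
    remotepath = posixpath.join(remotedir, zname).encode('utf-8')  # FTP paths use '/'
    localpath = os.path.join(creatdir(), zname).encode('gbk')
    print "Start connecting to FTP server ..."
    ftp = ftpconnect()
    ftp.set_debuglevel(2)  # enable debug output
    # print ftp.getwelcome()  # show the FTP server's welcome message
    bufsize = 1024  # buffer block size
    fp = None
    try:
        print "Start receiving files on server ..."
        fp = open(localpath.decode('gbk'), 'wb')  # open the local file for writing
        # Receive the file from the server and write it to the local file
        ftp.retrbinary('RETR ' + remotepath, fp.write, bufsize)
        fp.close()
        logging.debug("Read remote address is %s" % remotepath.decode('utf8').encode('gbk'))
        # Decode localpath so unicode and GBK bytes are not mixed in one format string
        logging.debug("%s Download success path is: %s" % (zname, localpath.decode('gbk')))
        print "%s Download success path: %s" % (zname, localpath.decode('gbk'))
    except Exception, e:
        print e
        logging.debug("%s download failed, closing file and leaving FTP server" % zname)
        print "Download Failed"
        if fp is not None:
            fp.close()  # Windows cannot delete a file that is still open
            os.remove(localpath.decode('gbk'))
    finally:
        ftp.quit()  # leave the FTP server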
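For comparison, here is a rough Python 3 sketch of the same idea. It is not the author's code: download is a hypothetical helper, and it assumes ftp is an already connected ftplib.FTP object (or anything with a compatible retrbinary method) that sends commands as UTF-8, which is the default encoding argument of ftplib.FTP since Python 3.9. Because Python 3 strings are Unicode, the file name needs no manual encode or decode on the local side.

```python
import os


def download(ftp, remote_dir, name, local_dir, blocksize=1024):
    """Fetch remote_dir/name into local_dir and return the local path."""
    # FTP paths always use '/', regardless of the local operating system.
    remote_path = remote_dir.rstrip("/") + "/" + name
    # open() accepts a Unicode path; Python converts it for the file system.
    local_path = os.path.join(local_dir, name)
    try:
        with open(local_path, "wb") as fp:
            ftp.retrbinary("RETR " + remote_path, fp.write, blocksize)
    except Exception:
        # The with block has already closed the file, so it can be removed.
        if os.path.exists(local_path):
            os.remove(local_path)
        raise
    return local_path
```

A call might look like download(ftp, "/pub", "你好.txt", "downloads"), with the directory names here being placeholders.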

This article is from the "left-handed" blog; please retain this source link when reposting: http://mofeihu.blog.51cto.com/1825994/1827563

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
