An analysis of Python coding problems

Source: Internet
Author: User

Original: http://www.th7.cn/Program/Python/201303/128631.shtml
2013-03-11 07:49:40


First of all, these problems appear only in Python 2.x. In Python 3.x the str type is always Unicode, so text in a program is handled as Unicode strings and raw bytes are a separate type. So how can encoding problems be avoided and sorted out when developing with Python 2.x? Above all, keep one consistent rule throughout, otherwise every other effort is wasted; standardizing on UTF-8 is the best choice.
1. Handling Non-ASCII encoding


The default encoding in Python 2 is ASCII, so the following error often appears when non-ASCII data is processed:
UnicodeDecodeError: 'ascii' codec can't decode byte 0x?? in position 1: ordinal not in range(128)
0x?? is a byte value of 128 or greater, which is outside the ASCII range.
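The error is easy to reproduce. The following is a minimal sketch, not taken from the original article, that decodes UTF-8 bytes with the ASCII codec. The sample text is an arbitrary assumption, and the snippet is written to run on Python 3 as well as Python 2.7.

```python
# -*- coding: utf-8 -*-

# Arbitrary sample text, encoded to UTF-8 bytes (every byte here is >= 0x80).
raw = u"中文".encode("utf-8")

try:
    # The ASCII codec only accepts byte values 0-127, so this fails.
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    message = str(exc)
    print(message)
```

The message names the first offending byte and its position, which is usually enough to tell which input was not plain ASCII.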
We often add an encoding declaration at the top of the file: # -*- coding: utf-8 -*-
That declaration only tells the interpreter how the source file itself is encoded. To process data in other encodings, Python's default encoding can be set to the required encoding, mainly by the following 2 methods:
01. Preferred Method
import sys
reload(sys)  # reload sys
sys.setdefaultencoding('utf-8')  # use utf-8 or gb2312, depending on the encoding you need
Why must the sys module be reloaded before calling setdefaultencoding? The import statement here is not the first import of sys. The interpreter has already imported the module during startup, possibly more than once, so this statement only obtains a reference to the existing module. Then why reload instead of calling the function directly? Because the setdefaultencoding function is deleted during startup, once the system has had its chance to call it, the imported reference no longer has it. Reloading sys restores the function, and it can then be called in code to change the interpreter's default encoding.
02. Global setting method
Create a new sitecustomize.py file in Python's lib/site-packages folder. sitecustomize.py is a special file that Python tries to load at startup, so its code runs before any program and the encoding is set automatically.
import sys
sys.setdefaultencoding('gb2312')
03. Checking the current encoding
import sys
sys.getdefaultencoding()
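As a small illustration of the two points above, the sketch below reads the default encoding and checks whether setdefaultencoding is present on a freshly imported sys. It is an added example rather than part of the original article. The variable names are arbitrary, and the result depends on the interpreter: Python 2 typically reports ascii, while Python 3 reports utf-8 and has no setdefaultencoding at all.

```python
import sys

# The interpreter-wide default used for implicit str/unicode conversions.
default_encoding = sys.getdefaultencoding()

# On a normally started interpreter the setter is not available on sys
# (Python 2 removes it during startup; Python 3 never provides it).
has_setter = hasattr(sys, "setdefaultencoding")

print(default_encoding)
print(has_setter)
```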


2. Detecting character encodings
The encoding of a string or file can be detected with chardet.
Installing chardet.
The easy_install tool installs chardet quickly with the following command: easy_install.exe chardet
Using chardet.
The detect function of chardet detects the encoding of the bytes it is given. It returns a dictionary with 2 entries: the confidence of the detection and the detected encoding.
import urllib
import chardet
rawdata = urllib.urlopen('http://www.sina.com.cn/').read()
print chardet.detect(rawdata)
# result: {'confidence': 0.99, 'encoding': 'GB2312'}
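Once a detection result is available, the usual next step is to decode the bytes with it. The sketch below is one possible way to do that and is not from the original article. decode_detected, min_confidence, and fallback are hypothetical names, and fake_detect is a stand-in that returns the same kind of dictionary as the result shown above, so the snippet runs without chardet installed or any network access. In real use, chardet.detect would be passed in its place.

```python
# -*- coding: utf-8 -*-

def decode_detected(raw, detect, min_confidence=0.5, fallback="utf-8"):
    """Decode raw bytes using a chardet-style detection result."""
    result = detect(raw)  # e.g. {'confidence': 0.99, 'encoding': 'GB2312'}
    encoding = result.get("encoding")
    # Fall back when nothing was detected or the guess is too uncertain.
    if not encoding or result.get("confidence", 0.0) < min_confidence:
        encoding = fallback
    # errors="replace" avoids an exception if the guess turns out wrong.
    return raw.decode(encoding, errors="replace")


def fake_detect(raw):
    # Hypothetical stand-in for chardet.detect; always reports GB2312.
    return {"confidence": 0.99, "encoding": "GB2312"}


raw = u"新浪".encode("gb2312")  # sample bytes in a known encoding
text = decode_detected(raw, fake_detect)
print(text == u"新浪")
```

The confidence threshold is a design choice: very short inputs often give low-confidence guesses, and falling back to the project's agreed encoding is usually safer than trusting them.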




3. Decoding when processing files
response = urllib.urlopen(url)
text = response.read().decode("utf-8")  # added by Insun
After setting the default encoding to UTF-8 as in the first section, I wrote a program that crawls mp3 files. The saved mp3 file names were garbled, even though the same names printed as correct Chinese. A garbled name looked like this:
Han 锛 Xuan mp3.

In this case the bytes clearly need to be decoded first:

decode("utf-8")
The BOM header problem is a separate topic and is not covered here.
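To make the effect of the decoding step concrete, here is a minimal sketch that is not taken from the original article. It decodes the same UTF-8 bytes once with the right codec and once with the wrong one. The file name is an invented sample, and gbk is only an assumed example of a mismatched codec.

```python
# -*- coding: utf-8 -*-

name = u"歌曲.mp3"            # invented sample file name
raw = name.encode("utf-8")    # bytes as they might arrive from the network

# Decoding with the codec the bytes were written in recovers the text.
decoded = raw.decode("utf-8")

# Decoding with a different codec does not fail loudly here; it yields
# garbled characters instead.
garbled = raw.decode("gbk", errors="replace")

print(decoded == name)
print(garbled == name)
```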




4. Garbled Chinese when using MySQL from Python
Accessing MySQL from Python requires installing Python-mysql.
It can be found online and is installed the same way as any other Python package.

Once installed, the module name is MySQLdb, and it can be used on both Windows and Linux.

Take the following measures to make sure MySQL output is not garbled:
1 Set the Python file encoding to utf-8 (add #encoding=utf-8 at the top of the file)
2 Set the MySQL database charset to utf-8 (charset=utf8)
3 Add the parameter charset=utf8 when connecting to MySQL from Python
4 Set Python's default encoding to utf-8 (sys.setdefaultencoding('utf-8'))

#encoding=utf-8

import sys
import MySQLdb

# Set the default encoding to utf-8 (measure 4)
reload(sys)
sys.setdefaultencoding('utf-8')

# Connect with charset utf8 (measure 3)
db = MySQLdb.connect(user='root', charset='utf8')
cur = db.cursor()
cur.execute('use mydb')
cur.execute('select * from mytb limit 100')

f = file("/home/user/work/tem.txt", 'w')

# Write each fetched row to the file
for i in cur.fetchall():
    f.write(str(i))
    f.write("")  # separator string is empty in the source

f.close()
cur.close()
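The file-writing step can also be done with an explicit encoding. This is a swapped-in technique, not the article's method, and it does not rely on sys.setdefaultencoding. In the sketch below, rows is a hypothetical stand-in for the result of cur.fetchall(), no database is contacted, and the output goes to a temporary directory instead of the path used above.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Hypothetical rows standing in for cur.fetchall().
rows = [(1, u"张三"), (2, u"李四")]

path = os.path.join(tempfile.mkdtemp(), "tem.txt")

# io.open encodes text to UTF-8 on write, whatever the default encoding is.
with io.open(path, "w", encoding="utf-8") as f:
    for row in rows:
        # One line per row, columns separated by tabs.
        f.write(u"\t".join(u"%s" % col for col in row))
        f.write(u"\n")

# Read the file back with the same encoding to check the round trip.
with io.open(path, "r", encoding="utf-8") as f:
    content = f.read()

print(content == u"1\t张三\n2\t李四\n")
```

Writing the columns as text, instead of str() on the whole tuple, keeps the Chinese characters readable in the file.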

