Character encoding in Python 2.x is a headache, but it is a problem that was solved long ago. While browsing the web today, I saw a Python user asking about encoding, so I am writing up the encoding problem here. My notes are as follows:
First look at the code, reading the explanations as you go.
# coding=utf-8
import sys
print sys.getdefaultencoding()  # --> ascii
u1 = '中国'
print type(u1), repr(u1)  # --> <type 'str'> '\xe4\xb8\xad\xe5\x9b\xbd'
u2 = u'中国2009'
print type(u2), repr(u2)  # --> <type 'unicode'> u'\u4e2d\u56fd2009'

# str --> unicode
print '# str --> unicode'
u1_1 = u1.decode('utf8')
print type(u1_1), repr(u1_1)  # --> <type 'unicode'> u'\u4e2d\u56fd'
u1_2 = unicode(u1, 'utf8')
print type(u1_2), repr(u1_2)  # --> <type 'unicode'> u'\u4e2d\u56fd'

# unicode --> str
print '# unicode --> str'
u2_1 = u2.encode('utf8')
print type(u2_1), repr(u2_1)  # --> <type 'str'> '\xe4\xb8\xad\xe5\x9b\xbd2009'
u2_2 = u2.encode('gbk')
print type(u2_2), repr(u2_2)  # --> <type 'str'> '\xd6\xd0\xb9\xfa2009'
u2_3 = u2.encode('gb2312')
print type(u2_3), repr(u2_3)  # --> <type 'str'> '\xd6\xd0\xb9\xfa2009'
The first line of the code, # coding=utf-8, declares the encoding of the current source file. If it is omitted, Python 2 falls back to its default, ASCII, and a non-ASCII character in the source raises a SyntaxError.
sys.getdefaultencoding() reports the interpreter's default encoding, which is ascii here.
In the first listing, the left side of each line is the program and the right side, after -->, is the output, so the two are easy to compare.
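The conversions in the first listing can be checked as a round trip. The following is only a minimal sketch, written so that it should run on both Python 2.7 and Python 3; it uses \u escapes instead of literal Chinese characters so that the result does not depend on the source file encoding. The variable names are illustrative and not from the original listing.

```python
from __future__ import print_function

text = u'\u4e2d\u56fd2009'          # same value as u2 in the listing

utf8_bytes = text.encode('utf8')    # unicode -> bytes, UTF-8
gbk_bytes = text.encode('gbk')      # unicode -> bytes, GBK

# The same text yields different byte sequences under different codecs.
print(repr(utf8_bytes))
print(repr(gbk_bytes))

# Decoding with the codec that produced the bytes restores the text.
assert utf8_bytes.decode('utf8') == text
assert gbk_bytes.decode('gbk') == text

# Decoding with 'ascii', the default reported by sys.getdefaultencoding()
# in the listing, fails because bytes >= 0x80 are not valid ASCII.
try:
    utf8_bytes.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
print(ascii_ok)  # False
```

The last step is what Python 2 attempts implicitly whenever a str holding non-ASCII bytes is mixed with a unicode object, which is why an explicit decode with the right codec is the safer habit.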
Remember the following two points:
1. In the program, prefix string literals with u so that everything is unicode. Otherwise, concatenating strings that contain Chinese characters can fail.
2. Do not put the u prefix on SQL statements, but do put it on the parameters passed in. Otherwise, database queries can fail in inexplicable ways. For example:
sql = 'select * from test where name = %s limit 1'
params = (u'中国',)
cursor.execute(sql, params)
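The two points above can be tried end to end without a database server. This is a rough sketch under some assumptions: it swaps in the standard-library sqlite3 module for whatever driver the snippet above uses, so the placeholder is ? rather than %s; to_unicode is a hypothetical helper, not part of any library; and the table test with a name column simply mirrors the query above. It should run on both Python 2.7 and Python 3.

```python
from __future__ import print_function
import sqlite3


def to_unicode(value, encoding='utf8'):
    """Hypothetical helper: decode byte strings, pass text through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Point 1: normalize to unicode before concatenating.
raw = b'\xe4\xb8\xad\xe5\x9b\xbd'       # UTF-8 bytes, e.g. read from a file
label = u'name: ' + to_unicode(raw)      # both operands are unicode

# Point 2: keep the SQL text plain and pass unicode values as parameters.
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
cursor.execute('create table test (name text)')
cursor.execute('insert into test (name) values (?)', (to_unicode(raw),))

sql = 'select * from test where name = ? limit 1'   # sqlite3 placeholder is ?
params = (u'\u4e2d\u56fd',)
cursor.execute(sql, params)
row = cursor.fetchone()
print(row is not None)  # True
conn.close()
```

Decoding once at the boundary, where bytes enter the program, keeps every later concatenation and query parameter in unicode, so no implicit ASCII conversion is ever attempted.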