Python basics-string unicode and python Basics
Unicode is often used to process Chinese Characters in python. Because it is easy to encounter character string Encoding Problems, I generally convert the character strings to unicode for processing.
Define a unicode string in python. You can add u before the string:
Str = u "hello world"
Python defines a non-escape string. You can add r before the string:
Path = r "c: \ programfile \ test"
Decodes and converts other string formats to unicode:
Ret = str. decode ("gb2312") ret = str. decode ("ascii") ret = str. decode ("UTF-8 ")
Encoding converts unicode characters to other string formats:
Ret = str. encode ("gb2312") ret = str. encode ("ascii") ret = str. encode ("UTF-8 ")
Chardef determines the encoding format of a string:
Encode = chardef. detect (str) print encode ['encoding']
String formatting % s
Print "test for % s, value is % d" % ("format", 123)
Generally, # encoding = UTF-8 is added at the beginning of the py file to avoid Chinese garbled characters in the file.
The most important problem in string processing is to know the format of string input. It is easy to handle strings during input.