# Author: aaron Fan
'''
ASCII: does not support Chinese; one English character takes 1 byte.
Unicode (the universal code, covering the scripts of all countries): supports Chinese, but in its 16-bit form
each English or Chinese character takes 2 bytes.
UTF-8 (a variable-length character encoding for Unicode):
English characters still take 1 byte, exactly as in ASCII, and common Chinese characters take 3 bytes each.
Unicode serves as the intermediate form when converting between national encodings. For example, when
software written for Chinese GBK shows garbled text on a Japanese system,
the GBK text must first be converted to Unicode before it can be displayed properly.
GBK: the full name is "Chinese Character Internal Code Extension Specification" ("GBK" comes from the first letters
of the Hanyu Pinyin for "national standard, extended": Guojia Biaozhun Kuozhan;
English name: Chinese Internal Code Specification)
'''
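'''
The byte sizes described above can be checked directly. This is only a minimal sketch: the helper name
encoded_length and the sample characters are hypothetical, chosen for illustration, and "utf-16-le" stands in
for the 2-byte Unicode form mentioned in the notes.
'''

```python
# Compare how many bytes the same character occupies under different encodings.

def encoded_length(text, encoding):
    """Return the number of bytes `text` occupies when encoded with `encoding`."""
    return len(text.encode(encoding))

# One English letter and one Chinese character (illustrative samples).
samples = ["A", "中"]

for ch in samples:
    for enc in ("utf-8", "utf-16-le", "gbk"):
        print(ch, enc, encoded_length(ch, enc))

# ASCII has no code for Chinese characters, so encoding raises an error.
try:
    "中".encode("ascii")
except UnicodeEncodeError as exc:
    print("ascii cannot encode this character:", exc.reason)
```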
'''
To convert between encodings (for example, UTF-8 to GBK):
1. First decode the bytes into Unicode.
2. Then encode the Unicode text into GBK.
In short: decode first, then encode.
'''
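'''
The two steps can be sketched on in-memory bytes as follows. The function name transcode and the sample
string are assumptions made for this illustration, not part of the original notes.
'''

```python
def transcode(data, source, target):
    """Convert bytes from one encoding to another by way of Unicode text."""
    text = data.decode(source)   # step 1: bytes in the source encoding -> Unicode str
    return text.encode(target)   # step 2: Unicode str -> bytes in the target encoding

# Illustrative input: two Chinese characters stored as GBK bytes.
gbk_bytes = "中文".encode("gbk")
utf8_bytes = transcode(gbk_bytes, "gbk", "utf-8")

# The text is the same; only the byte representation differs.
print(gbk_bytes, utf8_bytes)
print(utf8_bytes.decode("utf-8") == gbk_bytes.decode("gbk"))
```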
# Example:
'''
gbk_file is a GBK-encoded file.
Requirement:
Convert gbk_file into a new UTF-8 encoded file named gbk_to_utf8_file.
'''
# A one-line approach in Python 3
# Convert a GBK file to a UTF-8 file; source file gbk_file, destination file utf8file:
open('utf8file', 'w+', encoding='utf-8').write(open('gbk_file', 'r', encoding='gbk').read())
# Convert a UTF-8 file to a GBK file; source file utf8file, destination file gbk_file:
open('gbk_file', 'w+', encoding='gbk').write(open('utf8file', 'r', encoding='utf-8').read())
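'''
The one-liners leave both file objects to be closed by the interpreter and read the whole file into memory.
A slightly longer variant, offered only as a sketch, closes the files explicitly and copies line by line.
The function name convert_file and the temporary demo files are hypothetical.
'''

```python
import os
import tempfile

def convert_file(src_path, dst_path, src_encoding, dst_encoding):
    """Re-encode a text file, reading and writing one line at a time."""
    # newline="" keeps the original line endings unchanged on every platform.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        for line in src:
            dst.write(line)

# Demo in a temporary directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    gbk_path = os.path.join(tmp, "gbk_file")
    utf8_path = os.path.join(tmp, "gbk_to_utf8_file")

    # Create a small GBK-encoded source file.
    with open(gbk_path, "w", encoding="gbk", newline="") as f:
        f.write("中文 text\n")

    convert_file(gbk_path, utf8_path, "gbk", "utf-8")

    # Read the result back as raw bytes to inspect the new encoding.
    with open(utf8_path, "rb") as f:
        converted = f.read()

print(converted.decode("utf-8"))
```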
# Python 2 implementation:
# Convert a GBK file to a UTF-8 file; source file newfile, destination file utf8file:
open('utf8file', 'w+').write(open('newfile', 'r').read().decode('gbk').encode('utf-8'))