This article introduces how to use the chardet library in Python to detect character encodings. It covers what chardet does, how to install it, and how to use it, and may serve as a reference for readers who need it.

chardet is a Python module used to detect the encoding of strings and files.

1. chardet download and installation

Download Address: Http://

After downloading chardet, unzip the archive and place the chardet folder directly in your application's directory; you can then start using it with import chardet. Alternatively, copy the chardet folder into Python's system directory, so that all of your Python programs can simply use import chardet.
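To confirm that the package ended up somewhere Python can import it, a quick check along the following lines may help. This is only a sketch in Python 3 syntax; the helper name `is_installed` is an assumption made for illustration and is not part of chardet.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("chardet"):
    import chardet
    # chardet exposes its version string as chardet.__version__
    print("chardet version:", chardet.__version__)
else:
    print("chardet is not importable; check where the chardet folder was copied")
```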


2. Example

chardet.detect() returns a dictionary, in which confidence is the detector's confidence in its guess (a value between 0 and 1) and encoding is the name of the detected encoding.
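As a minimal sketch of how the two dictionary keys might be used together (Python 3 syntax), the helper below decodes a byte string with the detected encoding and falls back to a default when the confidence is low. The name `decode_with_detection`, the `min_confidence` threshold and the `fallback` encoding are assumptions chosen for illustration; they are not part of chardet.

```python
def decode_with_detection(raw, detect=None, min_confidence=0.5, fallback="utf-8"):
    """Decode bytes using the encoding reported by a chardet-style detector.

    `detect` is any callable returning a dict with 'encoding' and
    'confidence' keys; it defaults to chardet.detect.
    """
    if detect is None:
        import chardet  # imported lazily so a stub detector can be passed in
        detect = chardet.detect

    result = detect(raw)
    encoding = result.get("encoding")
    confidence = result.get("confidence") or 0.0

    # chardet reports encoding=None when it cannot decide (e.g. empty input).
    if encoding is None or confidence < min_confidence:
        encoding = fallback

    # errors="replace" avoids an exception if the guess turns out to be wrong.
    return raw.decode(encoding, errors="replace"), encoding
```

Passing the detector as a parameter keeps the helper easy to try out without a real input file; in ordinary use it can simply be called as decode_with_detection(rawdata).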

(1) Detecting the encoding of a web page:


    >>> import urllib
    >>> rawdata = urllib.urlopen('').read()
    >>> import chardet
    >>> chardet.detect(rawdata)
    {'confidence': 0.98999999999999999, 'encoding': 'GB2312'}
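The session above uses Python 2, where urlopen lives directly in urllib. In Python 3 the function is urllib.request.urlopen, so a rough equivalent might look like the sketch below. The function name `detect_url_encoding`, the `opener` parameter and the `max_bytes` limit are assumptions introduced for illustration, and any URL passed in is up to the caller.

```python
from urllib.request import urlopen


def detect_url_encoding(url, detect=None, opener=urlopen, max_bytes=65536):
    """Fetch up to max_bytes from url and return the detector's result dict."""
    if detect is None:
        import chardet  # imported lazily so a stub detector can be passed in
        detect = chardet.detect

    # The response is read as raw bytes; the detector must see undecoded data.
    with opener(url) as response:
        rawdata = response.read(max_bytes)

    return detect(rawdata)
```

Reading only the first part of the page is usually enough for a guess and keeps detection fast on large documents, although a smaller sample can lower the reported confidence.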

(2) Detecting the encoding of a file:


    import chardet
    tt = open('c:111.txt', 'rb')
    ff = tt.readline()  # read(5) also works here, but readlines() causes an error
    enc = chardet.detect(ff)
    print enc['encoding']
    tt.close()
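The readlines() failure noted in the comment is most likely because readlines() returns a list of lines, while chardet.detect() expects a single byte string. A Python 3 sketch of the same file check, reading a fixed-size sample instead of one line, could look like this; the name `detect_file_encoding` and the `sample_size` default are assumptions for illustration only.

```python
def detect_file_encoding(path, detect=None, sample_size=4096):
    """Return the detector's result dict for the first sample_size bytes of a file."""
    if detect is None:
        import chardet  # imported lazily so a stub detector can be passed in
        detect = chardet.detect

    # Open in binary mode so the detector receives raw, undecoded bytes.
    with open(path, "rb") as f:
        sample = f.read(sample_size)
        # f.readlines() would give a list of bytes objects instead; that list
        # would have to be joined first, e.g. b"".join(f.readlines()).

    return detect(sample)
```

Using a with block also closes the file automatically, which replaces the explicit tt.close() call in the original example.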

