Using chardet in Python to detect character encodings

Source: Internet
Author: User

This article introduces how to use chardet in Python to detect character encodings, covering what chardet does, how to install it, and how to use it. Readers who need it can use it as a reference.

In Python, chardet can be used to detect the encoding of a string or a file.

1. Downloading and installing chardet

Download address: http://pypi.python.org/pypi/chardet

After downloading chardet, unzip the archive and place the chardet folder directly in your application directory; you can then start using it with import chardet. Alternatively, install chardet into the Python system directory so that all of your Python programs can simply use import chardet:

    python setup.py install

2. Example

chardet.detect() returns a dictionary in which confidence is the detector's confidence in its guess (a value between 0 and 1) and encoding is the name of the detected encoding.

(1) Detecting the encoding of a web page:
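
The shape of the returned dictionary can be illustrated with a minimal sketch. It assumes chardet is installed; the helper name describe and the sample bytes are illustrative and do not come from the original article. Some chardet releases add further keys to the dictionary, so the sketch reads only the two keys described above.

```python
import chardet


def describe(raw):
    """Return (encoding, confidence) for a bytes object (hypothetical helper)."""
    result = chardet.detect(raw)
    # Read only the two keys discussed in the text; other keys may be present.
    return result['encoding'], result['confidence']


encoding, confidence = describe(b'plain ASCII text')
print(encoding, confidence)
```

Note that detect() expects bytes, not an already decoded string, so text must be passed in its raw, undecoded form.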

    >>> import urllib
    >>> rawdata = urllib.urlopen('http://www.google.cn/').read()
    >>> import chardet
    >>> chardet.detect(rawdata)
    {'confidence': 0.98999999999999999, 'encoding': 'GB2312'}
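
The session above uses urllib.urlopen, which exists only in Python 2. The following is a hedged Python 3 sketch of the same idea that also decodes the page once the encoding is known. The function names decode_with_detection and fetch_text and the utf-8 fallback are assumptions made for illustration, not part of the original article.

```python
import urllib.request

import chardet


def decode_with_detection(raw, fallback='utf-8'):
    """Decode bytes using chardet's guess, or the fallback if there is none."""
    guess = chardet.detect(raw)['encoding']
    # detect() reports None when it cannot make a guess (e.g. empty input).
    return raw.decode(guess or fallback, errors='replace')


def fetch_text(url):
    """Download a page and decode it with the detected encoding."""
    with urllib.request.urlopen(url) as response:
        return decode_with_detection(response.read())
```

Calling fetch_text('http://www.google.cn/') would repeat the experiment from the session above; the detected encoding depends on what the server returns at the time of the request.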

(2) Detecting the encoding of a file:

    import chardet

    tt = open('c:111.txt', 'rb')   # open in binary mode so that bytes are read
    ff = tt.readline()             # read(5) also works here, but readlines() raises an error
    enc = chardet.detect(ff)
    print(enc['encoding'])
    tt.close()
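
Detecting from a single line can be unreliable when the first line happens to be plain ASCII, and readlines() fails because it returns a list rather than bytes. One possible alternative, sketched here under the assumption that chardet's UniversalDetector class is available, feeds the file line by line and stops as soon as the detector is confident. The function name detect_file_encoding is hypothetical.

```python
from chardet.universaldetector import UniversalDetector


def detect_file_encoding(path):
    """Feed a file to chardet incrementally and return its result dictionary."""
    detector = UniversalDetector()
    with open(path, 'rb') as handle:   # binary mode: chardet works on bytes
        for line in handle:
            detector.feed(line)
            if detector.done:          # stop early once the detector is sure
                break
    detector.close()                   # finalize the guess
    return detector.result
```

The with statement closes the file even if an error occurs, which replaces the explicit tt.close() call in the example above.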

I hope this article will help you with your Python programming.
