This article shows how to get the encoding of a web page with Python and includes the implementation code. Readers who need it can use it as a reference.
Getting a web page's encoding with Python: implementation code
In Python development, the encoding of a web page can be detected automatically with the chardet library, which performs character set detection. It is not included with Python 2.7 and has to be downloaded from the official website. Here I downloaded the chardet-2.3.0.tar.gz archive. Extract it and copy the chardet folder it contains into python27/lib/site-packages/ under the Python installation directory.
Then import chardet.
The following function fetches the page at a given URL, detects its encoding automatically, and returns the name of that encoding.
import chardet  # character set detection
import urllib

url = "http://www.jd.com"

def automatic_detect(url):
    # Download the raw page bytes and let chardet guess their encoding
    content = urllib.urlopen(url).read()
    result = chardet.detect(content)
    encoding = result['encoding']
    return encoding

urls = ['http://www.baidu.com', 'http://www.163.com', 'http://dangdang.com']
for url in urls:
    print url, automatic_detect(url)
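The listing above targets Python 2.7: it relies on urllib.urlopen and the print statement, neither of which exists in Python 3. As a rough sketch only, the same function might look like this on Python 3, assuming chardet is installed there. The timeout parameter and the main helper are additions for illustration and are not part of the original code.

```python
import urllib.request


def automatic_detect(url, timeout=10):
    """Fetch url and return chardet's guess at the page encoding (may be None)."""
    # Imported lazily so this module still loads when chardet is not installed.
    import chardet

    # timeout is an assumption added for this sketch, not part of the original.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        content = response.read()  # raw bytes, not yet decoded
    result = chardet.detect(content)
    return result['encoding']


def main():
    # Same URL list as the Python 2 listing; this needs network access.
    urls = ['http://www.baidu.com', 'http://www.163.com', 'http://dangdang.com']
    for url in urls:
        print(url, automatic_detect(url))
```

Calling main() prints each URL followed by the detected encoding. The values depend on what the live pages return at that moment.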
The code above calls the detect function of the chardet module, which returns a dictionary, and then reads the detected encoding from the dictionary's 'encoding' key.
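Besides 'encoding', the dictionary returned by detect also carries a 'confidence' value, and 'encoding' is None when chardet cannot make a guess. One possible way to handle that is sketched below in Python 3. The helper names pick_encoding and decode_page, the 'utf-8' fallback, and the 0.5 threshold are all hypothetical choices for illustration, not something chardet prescribes.

```python
def pick_encoding(result, default='utf-8', min_confidence=0.5):
    """Choose an encoding name from a chardet.detect()-style result dict.

    The default and min_confidence values are illustrative assumptions.
    """
    encoding = result.get('encoding')
    confidence = result.get('confidence') or 0.0
    # Fall back when there is no guess or the guess is too uncertain.
    if encoding is None or confidence < min_confidence:
        return default
    return encoding


def decode_page(content, result):
    """Decode raw page bytes with the chosen encoding; undecodable bytes are replaced."""
    return content.decode(pick_encoding(result), errors='replace')


# Hand-written dicts in the shape detect() returns, so no download is needed.
print(pick_encoding({'encoding': 'GB2312', 'confidence': 0.99}))  # GB2312
print(pick_encoding({'encoding': None, 'confidence': 0.0}))       # utf-8
```

In real use, the result argument would be the dictionary returned by chardet.detect(content) for the downloaded page bytes.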
Thank you for reading. I hope this helps, and thank you for supporting this site!