Python practice 2: viewing the encoding of a web page
Programming environment: virtual Linux (Cygwin on Windows)
Goal: identify the encoding of a web page.
Usage: python coding http://www.***.com
Test results:
Web page capture and encoding in Python
If your installed BeautifulSoup version does not work with this code, try version 3.0.3.
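# Python 2, BeautifulSoup 3
import urllib2
from BeautifulSoup import BeautifulSoup

# Fetch the raw page; the URL needs an explicit scheme
f = urllib2.urlopen('http://www.baidu.com')
html = f.read()
f.close()

# Parse the page and print the parsed document
soup = BeautifulSoup()
soup.feed(html)
print soup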
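The script above fetches and parses the page but does not report the encoding itself. As a rough sketch of that step, the function below looks for a declared charset, first in the HTTP Content-Type header and then in the first few kilobytes of the page. It is written for Python 3 with only the standard library, which is an assumption (the post itself uses Python 2), and the names `declared_encoding` and `CHARSET_RE` are hypothetical. It only reads what the page declares; it does not guess the encoding from the bytes.

```python
import re

# Matches declarations such as charset=utf-8 or charset="gb2312".
CHARSET_RE = re.compile(r"charset\s*=\s*[\"']?\s*([A-Za-z0-9_.:-]+)", re.IGNORECASE)


def declared_encoding(content_type, html_bytes):
    """Return the declared charset in lowercase, or None if none is found."""
    # 1. Prefer the HTTP header, e.g. "text/html; charset=GBK".
    match = CHARSET_RE.search(content_type or "")
    if match:
        return match.group(1).lower()
    # 2. Fall back to a charset declaration in the first 4096 bytes,
    #    which is where a <meta> tag normally sits.
    head = html_bytes[:4096].decode("ascii", errors="ignore")
    match = CHARSET_RE.search(head)
    if match:
        return match.group(1).lower()
    return None


print(declared_encoding("text/html; charset=GBK", b""))
print(declared_encoding("text/html", b'<head><meta charset="utf-8"></head>'))
```

With a live page, the header value could come from `response.headers.get("Content-Type")` and the body from `response.read()` on the object returned by `urllib.request.urlopen`.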
Converting the encoding of a captured web page in Python
# encoding: utf-8
import urllib2

url = ''  # the URL is missing from the original post
f = urllib2.urlopen(url)
content = f.read()
f.close()

# Convert the page from UTF-8 to GB2312
content = content.decode('utf-8').encode("gb2312")

# Each comma-separated item has the form value|key
s1 = content.split(',')
result1 = {}
for s in s1:
    s2 = s.split('|')
    print s2[1]
    result1[s2[1]] = s2[0]
print result1

Tested and working.
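Also, it is normal for the items in the dictionary to print as a string of percent signs or similar escape sequences. Printing a dictionary shows the repr of each string, so non-ASCII bytes appear escaped instead of as readable characters; printing a single item displays it normally.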
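To see this without fetching anything, here is a small Python 3 sketch of the same convert, split and store steps. The sample data and the name `parse_pairs` are hypothetical; the real response body is not shown in the post, so only the `value|key,value|key` layout is taken from the script.

```python
# Hypothetical sample in the "value|key,value|key" layout the script splits on.
raw = "北京|bj,上海|sh".encode("utf-8")


def parse_pairs(data):
    """Convert UTF-8 bytes to GB2312, then build a {key: value} dict."""
    content = data.decode("utf-8").encode("gb2312")
    result = {}
    for item in content.split(b","):
        fields = item.split(b"|")
        result[fields[1]] = fields[0]  # second field is the key, first is the value
    return result


pairs = parse_pairs(raw)
print(pairs)  # the dict shows the repr of each value: non-ASCII bytes as \x escapes
city = pairs[b"bj"].decode("gb2312")  # decoding one item gives readable text again
```

Splitting after the conversion is safe here because every byte of a GB2312 two-byte character is above the ASCII range, so it cannot be mistaken for `,` or `|`.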