Once the character set has been determined, Chinese text has to be output with print to display correctly. An example follows:
import urllib2
import re

page = 1
url = 'http://www.qiushibaike.com/hot/page/' + str(page)
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
headers = {'User-Agent': user_agent}

# Fetch the page and save the raw source to a file
out_file = open("Qiushibaike.txt", "w")
request = urllib2.Request(url, headers=headers)
response = urllib2.urlopen(request)
buf = response.read()
out_file.write(buf)
out_file.close()

print buf  # prints the page source; formatting is correct and Chinese displays normally

# Extract image URLs and joke text from the page source
list_jpg = re.findall(r'http://.+\.jpg', buf)
list_joketxt = re.findall(r'<span>.+</span>', buf)

print list_joketxt  # displays incorrectly: Chinese is not shown normally
print list_joketxt[0]  # displays correctly: Chinese is shown normally
for jok in list_joketxt:
    print jok  # displays correctly: Chinese is shown normally
In short, to display Chinese characters in Python, print each string itself rather than the list that contains it.
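The difference between printing a list and printing its elements comes from how a list is rendered: each element is shown by its repr, which escapes non-ASCII bytes, while printing a single string outputs the text itself. The following is a minimal sketch of that behavior, not the original author's code. It assumes the page is UTF-8 encoded; the sample string and the helper name show_list are hypothetical and stand in for the regex matches above.

```python
# -*- coding: utf-8 -*-
# Hypothetical sample standing in for one match returned by re.findall.
# Assumption: the page bytes are UTF-8 encoded.
joke = u'<span>笑话</span>'.encode('utf-8')
jokes = [joke]


def show_list(items, encoding='utf-8'):
    """Return a readable rendering of a list of byte strings.

    Each item is decoded individually, so the Chinese text survives
    instead of being replaced by escape sequences.
    """
    return u'[' + u', '.join(item.decode(encoding) for item in items) + u']'


# repr() is what printing the whole list relies on: bytes are escaped.
list_repr = repr(jokes)

# Decoding each element first keeps the characters readable.
readable = show_list(jokes)

print(list_repr)
print(readable)
```

Decoding each element before joining is one way to get a readable dump of the whole list; looping over the list and printing each item, as in the example above, works just as well.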