Fetching web page content with a GET request
# coding=utf-8
# pip install requests beautifulsoup4
# Fetch the web page content directly with a GET request
import requests
from bs4 import BeautifulSoup

response = requests.get('https://www.sogou.com/web?query=infrastructure')
print(response.text)  # print everything returned by the search

# Find every <div class="vrwrap"></div> in response.text
soup = BeautifulSoup(response.text, 'html.parser')
new_list = soup.find_all(name='div', class_='vrwrap')
print(new_list)  # further searches can continue inside each <div class="vrwrap"></div>
1. Error message
Traceback (most recent call last):
  File "d:/pycharmprojects/crawler/day1/s1.py", line <module>
    print(new_list)
UnicodeEncodeError: 'gbk' codec can't encode character '\xa0' in position 2490: illegal multibyte sequence
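2. Cause: incorrect encoding format. The output is being encoded as GBK, which cannot represent the character '\xa0'.
3. Fix: change all encoding settings to UTF-8.
4. Run the script again; it now executes successfully.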
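The steps above can be reproduced without any network access. The following is only a minimal sketch, not the exact setup used here: the sample string and the helper name can_encode are made up for illustration, and sys.stdout.reconfigure assumes Python 3.7 or later. It shows that '\xa0' (a non-breaking space, which often appears in HTML) cannot be encoded as GBK but can be encoded as UTF-8, and one possible way to switch the script's output to UTF-8 from inside the code.

```python
import sys

# Hypothetical sample text; '\xa0' is a non-breaking space, as often found in HTML.
text = "search\xa0result"


def can_encode(s: str, encoding: str) -> bool:
    """Return True if s can be encoded with the given codec, else False."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


gbk_ok = can_encode(text, "gbk")     # GBK has no mapping for '\xa0'
utf8_ok = can_encode(text, "utf-8")  # UTF-8 can encode any Unicode character

# One possible fix (assumes Python 3.7+): switch stdout to UTF-8 before printing.
# The hasattr check skips this when stdout has been replaced by another object.
if hasattr(sys.stdout, "reconfigure"):
    sys.stdout.reconfigure(encoding="utf-8")

print("gbk:", gbk_ok, "utf-8:", utf8_ok)
```

If changing the code is not desirable, setting the environment variable PYTHONIOENCODING=utf-8 before running the script has a similar effect on the output encoding.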