Python 3: a QQ robot that automatically crawls Baidu Library search results and sends them to friends (mainly the crawler)

Source: Internet
Author: User

First, the effect is as follows:

Second, the operating environment:

Windows 10; Python 3; PyCharm

Third, the QQ robot uses the qqbot module

Install it with pip: pip install qqbot (it also needs the requests library)

To implement your own robot: there are several ways of writing it online, all very simple, but depending on the environment they sometimes produce errors. The following has been personally tested and runs:

from qqbot import QQBotSlot as qqbotslot, RunBot

@qqbotslot
def onQQMessage(bot, contact, member, content):
    if content == "-hello":  # content is the message sent by the friend
        bot.SendTo(contact, "I am a QQ robot")

if __name__ == "__main__":
    RunBot()

Fourth, crawl Baidu Library
Modules required: import urllib.request, urllib.parse, re
Get the raw page source:
Note in advance that Baidu Library pages are encoded in gb2312 (GBK is a superset of gb2312, so the code uses GBK).
def baidu(self, world):
    data = {}
    data['word'] = world
    url_world = urllib.parse.urlencode(data, encoding="GBK")
    url = "https://wenku.baidu.com/search?" + url_world + "&org=0&ie=gbk"
    page = urllib.request.urlopen(url)
    html = page.read()
    html = html.decode('GBK')
Code explanation:
data['word'] = world  # world is the search content, i.e. the keyword
url needs no explanation: it is the web link.
But there is one line of code between them: url_world = urllib.parse.urlencode(data, encoding="GBK")
Look at the link when searching Baidu Library for "大学" ("university"): https://wenku.baidu.com/search?word=%B4%F3%D1%A7&org=0&ie=gbk
Here %B4%F3%D1%A7 is the hexadecimal form of the GBK bytes of "大学", with each byte written as a %XX escape.
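The escape sequence can be checked offline with the standard library. This is a minimal sketch; it assumes that urllib.parse.quote is an acceptable stand-in for the encoding the site applies to the keyword.

```python
from urllib.parse import quote, unquote

keyword = "大学"  # "university"

# The raw GBK bytes of the keyword, shown as hex.
gbk_hex = keyword.encode("gbk").hex().upper()
print(gbk_hex)

# quote() writes each of those bytes as one %XX escape.
encoded = quote(keyword, encoding="gbk")
print(encoded)

# Unquoting with the same codec recovers the original keyword.
decoded = unquote(encoded, encoding="gbk")
print(decoded)
```

The two printed values contain the same four bytes; only the notation differs.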
In other words, to search for content related to "大学", the Chinese keyword has to be converted into the format above. What happens if it is not converted?
We put the Chinese "大学" directly into the link and visit it: https://wenku.baidu.com/search?word=大学&lm=0&od=0&fr=top_home&ie=gbk

The result is garbled text, and this garbling directly causes a decoding error later, when the raw page is decoded at this line:
        html = html.decode('GBK')
Decoding is what lets Chinese display correctly, but the garbled characters above cannot be decoded as GBK, so an error is raised.
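The failure mode is easy to reproduce without touching the network. The sketch below does not use the actual garbled page; as a stand-in it truncates a valid GBK byte string so that the last character is incomplete, which is one way bytes can become undecodable. The helper name try_decode is made up for this example.

```python
good = "大学".encode("gbk")  # four bytes, two per character
broken = good[:-1]           # drop the last byte: the second character is now incomplete

def try_decode(raw):
    """Return the decoded text, or None if the bytes are not valid GBK."""
    try:
        return raw.decode("gbk")
    except UnicodeDecodeError as exc:
        print("decode failed:", exc.reason)
        return None

print(try_decode(good))
print(try_decode(broken))
```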
So the parameter world cannot simply be placed in the link as it is.
The call urllib.parse.urlencode(data) converts Chinese into URL format.
However, its default encoding is UTF-8. Encoding the data as UTF-8 produces the following link:
https://wenku.baidu.com/search?word=%E5%A4%A7%E5%AD%A6&org=0&ie=gbk
The page returned by this link is garbled in the same way as when the Chinese is placed directly in the link.
Most examples found online are written this way, but Baidu Library uses gb2312 encoding, so an encoding parameter has to be added to the urlencode call,
as follows: urllib.parse.urlencode(data, encoding="GBK")
This way, both the URL encoding and the later decoding work correctly.
With this in place, the raw page contains the normal search results.
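The difference between the two encodings can be seen side by side. This is a minimal sketch: build_search_url and BASE are names introduced here for illustration, and the query parameters are copied from the links shown above.

```python
from urllib.parse import urlencode

BASE = "https://wenku.baidu.com/search?"

def build_search_url(keyword, encoding="gbk"):
    """Build the search link with the keyword percent-encoded in `encoding`."""
    query = urlencode({"word": keyword}, encoding=encoding)
    return BASE + query + "&org=0&ie=gbk"

gbk_url = build_search_url("大学")                     # GBK bytes in the query
utf8_url = build_search_url("大学", encoding="utf-8")  # what the default encoding produces

print(gbk_url)
print(utf8_url)
```

Keeping the link construction in its own function means it can be checked without sending a request.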
Get the information you want with a regular expression:

The above code extracts the titles and their corresponding links.
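As an illustration of the idea only, the sketch below runs a regular expression over a made-up HTML fragment. The markup, the title attribute and the /view/ link pattern are all assumptions invented for this example; they are not Baidu Library's actual page structure, so the pattern would need adjusting against the real page source.

```python
import re

# Hypothetical search-result markup, invented for this example.
sample_html = '''
<a href="https://wenku.baidu.com/view/abc123.html" title="大学物理讲义">...</a>
<a href="https://wenku.baidu.com/view/def456.html" title="大学英语词汇">...</a>
'''

# Capture the link and the title of each anchor.
# The non-greedy groups stop at the first closing quote.
pattern = re.compile(r'<a href="(https://wenku\.baidu\.com/view/.+?)" title="(.+?)"')

results = [(title, link) for link, title in pattern.findall(sample_html)]
for title, link in results:
    print(title, link)
```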
What remains is error handling.
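One possible shape for that error handling is sketched below. It is not the author's code: fetch_html, its opener parameter and the two stand-in openers are hypothetical names introduced here so that the function can be exercised without a network connection.

```python
import io
import urllib.error
import urllib.request

def fetch_html(url, opener=urllib.request.urlopen, encoding="gbk"):
    """Return the decoded page, or None if the request fails."""
    try:
        with opener(url) as page:
            raw = page.read()
    except (urllib.error.URLError, ValueError) as exc:
        # URLError covers network and HTTP failures; ValueError covers malformed links.
        print("request failed:", exc)
        return None
    # errors="replace" turns undecodable bytes into U+FFFD instead of raising.
    return raw.decode(encoding, errors="replace")

# Offline demonstration with stand-in openers (no request is sent).
def fake_ok(url):
    return io.BytesIO("大学".encode("gbk") + b"\xff")  # trailing invalid byte

def fake_down(url):
    raise urllib.error.URLError("network unreachable")

ok_result = fetch_html("https://wenku.baidu.com/search?word=x", opener=fake_ok)
down_result = fetch_html("https://wenku.baidu.com/search?word=x", opener=fake_down)
print(ok_result)
print(down_result)
```

Returning None on failure lets the robot reply with a short apology instead of crashing; replacing bad bytes keeps the rest of the page usable for the regular expression.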
The full code is as follows:


