Getting Started with Python Crawlers: Using requests to Build a Zhihu API (3)
Preface
An earlier article in this crawler series, on the elegant HTTP library requests, described how to use requests. This time we use requests to build a Zhihu API with features such as sending private messages, upvoting articles, and following users. Any feature that involves a user action requires being logged in first, so before reading this article I recommend that you first learn how to simulate login with Python. From here on, I assume you already know how to simulate login with requests.
Analyzing the Approach
Sending a private message comes down to the browser sending an HTTP request to the server. The request consists of the request URL, the request headers, and the request body. Once these three pieces are known, it is easy to use requests to mimic the browser and send a private message.
Open Chrome, pick a user, and click "Send private message" while watching the network panel to trace the request that is made.
First, look at the request headers.
The request headers carry the cookie with the login information. There is also an authorization field used for user authentication, and its value also appears in the cookie (I blurred it in the screenshot to avoid leaking my cookie). Both must be included when the request is sent with requests.
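A minimal sketch of how these header fields could be collected before being attached to a session. The helper name build_headers and the placeholder values are assumptions for illustration only; the real cookie and authorization values have to be copied from your own logged-in browser session.

```python
def build_headers(authorization, cookie):
    """Return the headers a logged-in request needs (illustrative helper)."""
    if not authorization or not cookie:
        raise ValueError("both authorization and cookie are required")
    return {
        "Host": "www.zhihu.com",
        "Referer": "https://www.zhihu.com/",
        "authorization": authorization,
        "Cookie": cookie,
    }


# Placeholder values only; real ones come from the browser's network panel.
headers = build_headers("<authorization value>", "<cookie string>")
```

With requests, a dict like this can be applied once to a session via session.headers.update(headers), so every later request carries the same fields.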
Let's take a look at the request URL and body.
The request URL is https://www.zhihu.com/api/v4/messages, the request method is POST, and the request body is:
{"type": "common", "content": "Hello, my name is pythoner", "receiver_hash": "1da75b85900e00adb072e91c56fd9149"}
The request body is a JSON string. The type and content fields are self-explanatory, but it is not yet clear what receiver_hash is, so that needs to be confirmed. A reasonable guess is that it is something like a user id.
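The request described here could be sketched as follows. The function names are hypothetical, and post_message accepts any session object that exposes a requests-style post() method, so in real use you would pass a logged-in requests.Session.

```python
import json

MESSAGE_URL = "https://www.zhihu.com/api/v4/messages"


def build_message_payload(receiver_hash, content):
    """Build the JSON body observed in the browser's network panel."""
    return {"type": "common", "content": content, "receiver_hash": receiver_hash}


def post_message(session, payload):
    """POST the payload through a requests-style session object."""
    # json=payload lets requests serialize the dict and set Content-Type.
    return session.post(MESSAGE_URL, json=payload)


payload = build_message_payload("1da75b85900e00adb072e91c56fd9149",
                                "Hello, my name is pythoner")
body = json.dumps(payload)  # the JSON string sent as the request body
```

Keeping the payload construction separate from the network call makes it easy to inspect the body before anything is actually sent.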
Now the question is: how can we find the user id from the URL of a user's home page? To simulate the whole private message process end to end, I registered a second Zhihu account specifically for this.
If you do not have a spare phone number, you can search Google for "receive sms online"; many websites offer free phone numbers for receiving SMS online. The home page of the second account I registered is https://www.zhihu.com/people/xiaoxiaodouzi
First, follow the second account from the main account and then find it in the list of users I follow. Hovering the mouse over the second account's avatar triggers an HTTP request.
The request URL is https://www.zhihu.com/api/v4/members/xiaoxiaodouzi. The trailing "xiaoxiaodouzi" matches the last part of the second account's home page URL; we call it url_token.
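The mapping from home page URL to API URL can be sketched like this. Both function names are made up for illustration; the only facts taken from the article are the two URL shapes.

```python
from urllib.parse import urlparse

MEMBERS_API = "https://www.zhihu.com/api/v4/members/"


def url_token_from_profile(profile_url):
    """Extract the url_token from a home page URL like .../people/<url_token>."""
    parts = [p for p in urlparse(profile_url).path.split("/") if p]
    if len(parts) < 2 or parts[0] != "people":
        raise ValueError("not a user home page URL: %r" % profile_url)
    return parts[1]


def members_api_url(url_token):
    """Build the member-information API URL for a url_token."""
    return MEMBERS_API + url_token


token = url_token_from_profile("https://www.zhihu.com/people/xiaoxiaodouzi")
api_url = members_api_url(token)
```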
The data returned by the API is the personal public information of the user.
{ ... "id":"1da75b85900e00adb072e91c56fd9149", "favorite_count":0, "voteup_count":0, "commercial_question_count":0, "url_token":"xiaoxiaodouzi", "type":"people", "avatar_url":"https://pic1.zhimg.com/v2-ca13758626bd7367febde704c66249ec_is.jpg", "is_active":1492224390, "name":"\u6211\u662f\u5c0f\u53f7", "url":"http://www.zhihu.com/api/v4/people/1da75b85900e00adb072e91c56fd9149", "gender":-1 ...}
We can clearly see that there is an id field. As we have guessed before, the receiver_hash field in the private message is the user id.
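As a small illustration, the id can be pulled out of a response like the one above. The sample string is an abridged copy of that response with only a few fields kept, and the helper name is hypothetical.

```python
import json

# Abridged copy of the response shown above; only a few fields are kept.
sample_response = (
    '{"id": "1da75b85900e00adb072e91c56fd9149", '
    '"url_token": "xiaoxiaodouzi", "type": "people"}'
)


def receiver_hash_from_profile(profile):
    """Return the id field that the message API expects as receiver_hash."""
    user_id = profile.get("id")
    if not user_id:
        raise KeyError("profile has no 'id' field")
    return user_id


profile = json.loads(sample_response)
receiver_hash = receiver_hash_from_profile(profile)
```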
Code Implementation
At this point the approach for the private message feature is clear, and the implementation follows naturally.
User Information
To obtain the receiver_hash value required by the private message interface, we first need to fetch the user information, which contains the id we need.
@need_login
def user(self, url_token):
    """
    Get user information.
    :param url_token: the part of the user home page URL that identifies the user.
        For example, for https://www.zhihu.com/people/xiaoxiaodouzi the url_token is xiaoxiaodouzi
    :return: dict
    """
    response = self._session.get(URL.profile(url_token))
    return response.json()
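The method above is wrapped in @need_login, whose source is not listed in this article. Below is a minimal sketch of how such a decorator could work, assuming the instance exposes a logged_in attribute. That attribute, the NeedLoginError exception, and the Demo class are hypothetical names, and the real decorator in the zhihu-api repository may check the login state differently.

```python
import functools


class NeedLoginError(Exception):
    """Raised when a login-only method is called without logging in."""


def need_login(func):
    """Hypothetical decorator: run func only if the instance is logged in."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        # logged_in is an assumed attribute, not taken from the article.
        if not getattr(self, "logged_in", False):
            raise NeedLoginError("please log in before calling %s" % func.__name__)
        return func(self, *args, **kwargs)
    return wrapper


class Demo(object):
    def __init__(self, logged_in):
        self.logged_in = logged_in

    @need_login
    def whoami(self):
        return "ok"
```

Using functools.wraps keeps the decorated method's name and docstring intact, which helps with logging and debugging.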
Send Private Message
@need_login
def send_message(self, user_id, content):
    """
    Send a private message to the specified user.
    :param user_id: user ID
    :param content: private message content
    """
    data = {"type": "common", "content": content, "receiver_hash": user_id}
    response = self._session.post(URL.message(), json=data)
    data = response.json()
    if data.get("error"):
        self.logger.info("failed to send private message, %s" % data.get("error").get("message"))
    else:
        self.logger.info("sent successfully")
    return data
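Both methods call a URL helper that is not listed in this article. Here is a minimal sketch of what it might look like, assuming it does nothing more than assemble the two endpoints observed in the network panel; the actual class in the zhihu-api repository may differ.

```python
class URL(object):
    """Hypothetical helper that builds the API endpoints used above."""
    host = "https://www.zhihu.com"

    @classmethod
    def profile(cls, url_token):
        # Member-information endpoint; url_token comes from the home page URL.
        return cls.host + "/api/v4/members/%s" % url_token

    @classmethod
    def message(cls):
        # Private message endpoint, called with POST.
        return cls.host + "/api/v4/messages"
```

Centralizing the endpoints this way means a change to the API paths only has to be made in one place.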
The two methods above live in a class named Zhihu; only the key code is listed here. @need_login is a decorator that checks authentication, meaning the method can only be called after logging in. You may notice that I do not explicitly set the headers in each request. That is because they are set once on the session in the __init__ method.
def __init__(self):
    self._session = requests.session()
    self._session.verify = False
    self._session.headers = {"Host": "www.zhihu.com",
                             "Referer": "https://www.zhihu.com/",
                             'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36'
                                           ' (KHTML, like Gecko) Chrome/56.0.2924.87',
                             }
    self._session.cookies = cookiejar.LWPCookieJar(filename=cookie_filename)
    try:
        self._session.cookies.load(ignore_discard=True)
    except:
        pass
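The cookie handling in __init__ relies on LWPCookieJar persisting cookies to a file between runs. The sketch below shows that round trip in isolation using only the standard library. The cookie name and value are placeholders rather than real Zhihu cookies, and make_cookie and save_and_reload are hypothetical helpers.

```python
import os
import tempfile
from http import cookiejar


def make_cookie(name, value, domain="www.zhihu.com"):
    """Build a session cookie (no expiry date) for demonstration purposes."""
    return cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def save_and_reload(path):
    """Save a cookie to disk, then load it into a fresh jar."""
    jar = cookiejar.LWPCookieJar(filename=path)
    jar.set_cookie(make_cookie("example_cookie", "dummy-token"))
    # Session cookies are flagged discard=True, so ignore_discard is needed
    # both when saving and when loading, otherwise they are skipped.
    jar.save(ignore_discard=True)

    fresh = cookiejar.LWPCookieJar(filename=path)
    fresh.load(ignore_discard=True)
    return {c.name: c.value for c in fresh}


cookie_file = os.path.join(tempfile.mkdtemp(), "cookies.txt")
restored = save_and_reload(cookie_file)
```

On the very first run no cookie file exists yet, so load() raises an error; that is why the __init__ code wraps the call in try/except.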
Running the Code
from zhihu import Zhihu

if __name__ == '__main__':
    zhihu = Zhihu()
    profile = zhihu.user("xiaoxiaodouzi")
    _id = profile.get("id")
    zhihu.send_message(_id, "Hello, this is a greeting from the Zen of Python")
After the script runs, the second account successfully receives the private message I sent.
Finally, features such as following users and upvoting can be implemented by following the same approach.
Source Code address: https://github.com/lzjun567/zhihu-api
http://xiazai.jb51.net/201705/yuanma/zhihu-api (jb51.net).rar
Summary
That is all for this article. I hope it helps you learn or use Python. If you have any questions, please leave a comment. Thank you for your support.