Getting started with Python crawlers: using requests to build a Zhihu API (3)

Source: Internet
Author: User

Preface

An earlier article in this crawler series, on the elegant HTTP library requests, described how to use requests. This time we use requests to build a Zhihu API with functions such as sending private messages, upvoting articles, and following users. Any function that acts on behalf of a user requires being logged in first, so before reading this article we recommend that you first understand simulated login in Python. From here on, we assume that you already know how to use requests to simulate a login.

Analysis of the Approach

Sending a private message comes down to the browser sending an HTTP request to the server. The request consists of the request URL, the request headers, and the request body. Once these three pieces are clear, it is easy to use requests to simulate the browser and send a private message.

Open Chrome, find a user, click "send private message", and trace the network request that the private message triggers.

First, look at the request headers.

The request headers contain the cookie that carries the login information. In addition, there is an authorization field, which is used for user authentication and whose value also appears in the cookie (I blurred it in the screenshot to avoid leaking cookie information). Both must be included when the request is sent with requests.

Next, look at the request URL and body.

The request URL is https://www.zhihu.com/api/v4/messages, the request method is POST, and the request body is:

{"type": "common", "content": "Hello, my name is pythoner", "receiver_hash": "1da75b85900e00adb072e91c56fd9149"}

The request body is a JSON string. The type and content fields are self-explanatory, but it is not yet clear what receiver_hash is, so this needs to be determined. A reasonable guess is that it is a field similar to a user id.
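As a minimal sketch of the request described above, the body can be built as a dictionary and posted through an already logged-in session. The function names build_message_body and post_message are hypothetical, and the session argument is assumed to behave like a requests.Session that already carries the login cookie and authorization header.

```python
MESSAGES_URL = "https://www.zhihu.com/api/v4/messages"


def build_message_body(receiver_hash, content):
    """Return the JSON body captured from the browser's private-message request."""
    return {"type": "common", "content": content, "receiver_hash": receiver_hash}


def post_message(session, receiver_hash, content):
    """POST the body with an already-authenticated session.

    `session` is an assumption: any object with a requests.Session-like
    .post(url, json=...) method that returns a response with .json().
    """
    body = build_message_body(receiver_hash, content)
    response = session.post(MESSAGES_URL, json=body)
    return response.json()
```

Passing json=body lets requests serialize the dictionary and set the JSON content type, which matches the JSON string seen in the browser.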

Now the question is how to find the user id from the URL of a user's home page. To simulate the whole private-message process end to end, I registered a secondary Zhihu account.

If you do not have a spare phone number, you can search Google for "receive sms online"; many sites provide free phone numbers that receive SMS online. The home page of the secondary account I registered is https://www.zhihu.com/people/xiaoxiaodouzi

First follow the secondary account, then find it in the list of users I follow and hover the mouse over its avatar. This triggers an HTTP request.

The request URL is https://www.zhihu.com/api/v4/members/xiaoxiaodouzi. The trailing "xiaoxiaodouzi" matches the last part of the secondary account's home page URL; we call it the url_token.
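The relationship between the home page URL and the url_token can be sketched with the standard library. The helper name url_token_from_homepage is hypothetical, and the check for a leading "people" path segment is an assumption based only on the one home page URL shown in this article.

```python
from urllib.parse import urlparse


def url_token_from_homepage(homepage_url):
    """Return the <url_token> part of a .../people/<url_token> home page URL."""
    path = urlparse(homepage_url).path          # e.g. "/people/xiaoxiaodouzi"
    parts = [p for p in path.split("/") if p]   # drop empty segments
    if len(parts) < 2 or parts[0] != "people":
        raise ValueError("not a user home page URL: %s" % homepage_url)
    return parts[1]


token = url_token_from_homepage("https://www.zhihu.com/people/xiaoxiaodouzi")
print(token)  # xiaoxiaodouzi
```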

The API returns the user's public profile information.

{  ... "id":"1da75b85900e00adb072e91c56fd9149", "favorite_count":0, "voteup_count":0, "commercial_question_count":0, "url_token":"xiaoxiaodouzi", "type":"people", "avatar_url":"https://pic1.zhimg.com/v2-ca13758626bd7367febde704c66249ec_is.jpg", "is_active":1492224390, "name":"\u6211\u662f\u5c0f\u53f7", "url":"http://www.zhihu.com/api/v4/people/1da75b85900e00adb072e91c56fd9149", "gender":-1 ...}

We can clearly see that there is an id field, and its value is identical to the receiver_hash captured in the private message request. As guessed earlier, the receiver_hash field in the private message is the user id.
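To make the link between the two responses concrete, here is a small sketch that reads the id out of an abridged copy of the profile JSON and places it in the message body. The helper name receiver_hash_from_profile is hypothetical, and the abridged JSON keeps only a few of the fields shown above.

```python
import json

# Abridged copy of the profile response shown above (only a few fields kept).
profile_json = (
    '{"id": "1da75b85900e00adb072e91c56fd9149",'
    ' "url_token": "xiaoxiaodouzi", "type": "people"}'
)


def receiver_hash_from_profile(profile):
    """Return the id field that the message API expects as receiver_hash."""
    user_id = profile.get("id")
    if not user_id:
        raise KeyError("profile has no 'id' field")
    return user_id


profile = json.loads(profile_json)
body = {
    "type": "common",
    "content": "Hello, my name is pythoner",
    "receiver_hash": receiver_hash_from_profile(profile),
}
print(body["receiver_hash"])  # 1da75b85900e00adb072e91c56fd9149
```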

Code Implementation

At this point the logic of the private message function is clear, and the implementation follows naturally.

User Information

To obtain the receiver_hash field required by the private message interface, we first need to fetch the user information, which contains the id value to use.

@need_login
def user(self, url_token):
    """
    Get user information.
    :param url_token: url_token is part of the user home page URL.
        For example, for https://www.zhihu.com/people/xiaoxiaodouzi
        the url_token is xiaoxiaodouzi
    :return: dict
    """
    response = self._session.get(URL.profile(url_token))
    return response.json()
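The URL helper used in the method above is not listed in the article. A minimal sketch of what it might look like follows; the class layout is an assumption, and only the two endpoint URLs are taken from the requests observed earlier. The real helper in the zhihu-api repository may differ.

```python
class URL(object):
    """Hypothetical URL helper; only the endpoint paths come from the article."""

    host = "https://www.zhihu.com"

    @classmethod
    def profile(cls, url_token):
        # Member profile endpoint seen when hovering over a user's avatar.
        return "%s/api/v4/members/%s" % (cls.host, url_token)

    @classmethod
    def message(cls):
        # Private message endpoint seen when sending a message in the browser.
        return "%s/api/v4/messages" % cls.host


print(URL.profile("xiaoxiaodouzi"))  # https://www.zhihu.com/api/v4/members/xiaoxiaodouzi
print(URL.message())                 # https://www.zhihu.com/api/v4/messages
```

Keeping the endpoints in one place means each API method only names the resource it needs, not the full address.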

Send private message

@need_login
def send_message(self, user_id, content):
    """
    Send a private message to the specified user.
    :param user_id: user ID
    :param content: private message content
    """
    data = {"type": "common", "content": content, "receiver_hash": user_id}
    response = self._session.post(URL.message(), json=data)
    data = response.json()
    if data.get("error"):
        self.logger.info("failed to send private message, %s" % data.get("error").get("message"))
    else:
        self.logger.info("sent successfully")
    return data

The two methods above belong to a class named Zhihu; only the key code is listed here. @need_login is a decorator for user authentication, indicating that the method can only be called after login. You may notice that I did not explicitly set the headers in each request. That is because I set them once in the __init__ method.

def __init__(self):
    self._session = requests.session()
    self._session.verify = False
    # Headers shared by every request made through this session
    self._session.headers = {"Host": "www.zhihu.com",
                             "Referer": "https://www.zhihu.com/",
                             'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36'
                                           ' (KHTML, like Gecko) Chrome/56.0.2924.87',
                             }
    # Reuse cookies saved by an earlier login, if the cookie file can be loaded
    self._session.cookies = cookiejar.LWPCookieJar(filename=cookie_filename)
    try:
        self._session.cookies.load(ignore_discard=True)
    except:
        pass
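The @need_login decorator mentioned above is not listed in the article. The following is a minimal sketch of how such a decorator could work, not the author's implementation. NeedLoginError, is_logged_in, and the Client class are all hypothetical names introduced for illustration.

```python
import functools


class NeedLoginError(Exception):
    """Raised when a login-only method is called without being logged in."""


def need_login(func):
    """Refuse to run the wrapped method unless the client reports a login."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        if not self.is_logged_in():
            raise NeedLoginError("please log in before calling %s" % func.__name__)
        return func(self, *args, **kwargs)
    return wrapper


class Client(object):
    """Stand-in for the Zhihu class; the login state is faked with a flag."""

    def __init__(self, logged_in):
        self._logged_in = logged_in

    def is_logged_in(self):
        return self._logged_in

    @need_login
    def whoami(self):
        return "ok"
```

With this sketch, Client(True).whoami() returns "ok", while Client(False).whoami() raises NeedLoginError before the method body runs. In a real client, is_logged_in would more plausibly inspect the session's cookies than a flag.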

Calling the API

from zhihu import Zhihu

if __name__ == '__main__':
    zhihu = Zhihu()
    profile = zhihu.user("xiaoxiaodouzi")
    _id = profile.get("id")
    zhihu.send_message(_id, "Hello, this is a greeting from the Zen of Python")

After the script runs, the secondary account successfully receives the private message I sent.

Finally, other functions such as following users and upvoting can be implemented along the same lines.

Source Code address: https://github.com/lzjun567/zhihu-api

http://xiazai.jb51.net/201705/yuanma/zhihu-api (jb51.net).rar

Summary

That is all for this article. I hope it helps you learn or use Python. If you have any questions, please leave a message. Thank you for your support.
