Python-based zhihu_oauth

Source: Internet
Author: User
Tags oauth


Today I came across zhihu_oauth, a well-known open-source Zhihu crawler library written in Python. It has a lot of stars on GitHub and the documentation looks quite detailed, so I did a little research and found it useful. Here is a brief introduction to how to use it.

The project's home page is https://github.com/7sdream/zhihu-oauth. The author's Zhihu home page is https://www.zhihu.com/people/7sdream/.

The project's documentation is at http://zhihu-oauth.readthedocs.io/zh_CN/latest/index.html. The original author has already explained how to use the library in great detail, so repeating all of it here would be gilding the lily. If you want to learn more about the library, go to the official documentation. I only want to cover the important points that I think need supplementing.

First, installation. The author has uploaded the project to PyPI, so we can install it directly with pip. According to the author, the project supports Python 3 best and is currently also compatible with Python 2, so you had better use Python 3. Install it with pip3 install -U zhihu_oauth.

The first step after installation is to log on. You can log on directly using the following code.

  

    from zhihu_oauth import ZhihuClient
    from zhihu_oauth.exception import NeedCaptchaException

    client = ZhihuClient()
    user = 'email_or_phone'
    pwd = 'password'
    try:
        client.login(user, pwd)
        print(u"login successful!")
    except NeedCaptchaException:  # handle the case where a captcha is required
        # save the captcha image, prompt the user to enter it, then log in again
        with open('a.gif', 'wb') as f:
            f.write(client.get_captcha())
        captcha = input('Please input captcha: ')
        client.login(user, pwd, captcha)

    client.save_token('token.pkl')  # save the token
    # with a token file, you can load it directly the next time you log in
    # client.load_token('filename')

The code above logs in with the account name and password and then saves the token, so next time we can log in with the token instead of entering the password every time.
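The save-then-reload pattern described above could be wrapped in a small helper. The following is only a sketch: `load_or_login` and its `do_login` parameter are hypothetical names, not part of zhihu_oauth, and the helper relies only on the `load_token` and `save_token` calls already shown in the login code. The client is passed in as an argument, so the block itself does not import the library.

```python
import os


def load_or_login(client, token_path, do_login):
    """Reuse a saved token when one exists; otherwise log in and save it.

    `do_login` is a caller-supplied function that performs the password
    (and, if needed, captcha) login on `client`.
    Returns True if the token file was reused, False if a fresh login ran.
    """
    if os.path.isfile(token_path):
        client.load_token(token_path)
        return True
    do_login(client)
    client.save_token(token_path)
    return False
```

With a real client this might be called as `load_or_login(ZhihuClient(), 'token.pkl', lambda c: c.login(user, pwd))`, assuming `user` and `pwd` are defined as in the login code.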

After logging on, you can do a lot of things. For example, the following code can obtain basic information about your account.

    from __future__ import print_function  # use Python 3's print function
    from zhihu_oauth import ZhihuClient

    client = ZhihuClient()
    client.load_token('token.pkl')  # load the token file
    # get your own account object
    me = client.me()

    # the five most recent answers
    for _, answer in zip(range(5), me.answers):
        print(answer.question.title, answer.voteup_count)

    print('----------')

    # the five answers with the most upvotes
    for _, answer in zip(range(5), me.answers.order_by('votenum')):
        print(answer.question.title, answer.voteup_count)

    print('----------')

    # the five most recent questions
    for _, question in zip(range(5), me.questions):
        print(question.title, question.answer_count)

    print('----------')

    # the five most recently published articles
    for _, article in zip(range(5), me.articles):
        print(article.title, article.voteup_count)
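The loops above pair each collection with `range(5)` so that iteration stops after five items. An alternative that some readers may find clearer is `itertools.islice` from the standard library. The sketch below uses a stand-in generator, `fake_answers`, in place of `me.answers`; it is an assumption here that the library's collections can be iterated lazily in the same way. `first_n` is a hypothetical helper name.

```python
from itertools import islice


def fake_answers():
    """Stand-in for a lazily evaluated collection such as me.answers."""
    n = 0
    while True:  # pretend the list of answers is very long
        n += 1
        yield 'answer %d' % n


def first_n(iterable, n):
    """Return at most the first n items of any iterable as a list."""
    return list(islice(iterable, n))


for title in first_n(fake_answers(), 5):
    print(title)
```

Because `islice` stops pulling items once it has `n` of them, the rest of the collection is never requested.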

Of course, we can do more than that. For example, if we know the URL or the question id of a question, we can get the number of answers, the author's information, and a range of other details. The developer has thought this through well, and most commonly needed information is covered. I will not post the specific code here; please refer to the official documentation.
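When only the URL of a question is at hand, the numeric id can be pulled out of it before calling `client.question(id)` as in the image script later in this article. This is a minimal sketch: `question_id_from_url` is a hypothetical helper, not a library function, and it assumes question URLs have the form https://www.zhihu.com/question/24400664 used in this article.

```python
import re

# Assumed URL shape: https://www.zhihu.com/question/<digits>
_QUESTION_URL = re.compile(r'zhihu\.com/question/(\d+)')


def question_id_from_url(url):
    """Return the numeric question id found in a URL, or None if absent."""
    match = _QUESTION_URL.search(url)
    return int(match.group(1)) if match else None


print(question_id_from_url('https://www.zhihu.com/question/24400664'))
```

The returned integer can then be passed to `client.question(...)`, after which attributes such as `title` and `answer_count` are available on the result.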

A small tip: this library has many classes, such as the class for author information and the class for article information, and each class has many methods. When I checked the official documentation, I found that it does not list all the attributes of some classes. So how can we see all the attributes of a class? It is actually very simple: use Python's built-in dir function. dir(object) returns all the attributes of an object (or class). For example, if we have an object of the answer class, dir(answer) returns the list of all attributes of that object. Leaving aside the default attributes, we can quickly find the ones we need, which is very convenient. (The following shows all the attributes of the collection class.)

['__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_build_data', '_build_params', '_build_url', '_cache', '_data', '_get_data', '_id', '_method', '_refresh_times', '_session', 'answer_count', 'answers', 'articles', 'comment_count', 'comments', 'contents', 'created_time', 'creator', 'description', 'follower_count', 'followers', 'id', 'is_public', 'pure_data', 'refresh', 'title', 'updated_time']
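Since most of the names in a dir listing are default or private attributes, it can help to filter out everything that starts with an underscore. The sketch below does this with a stand-in class, `FakeCollection`, so that it runs without logging in; `public_attributes` is a hypothetical helper name, and with the real library you would pass in an actual object such as a collection or an answer.

```python
def public_attributes(obj):
    """Return the names from dir(obj) that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith('_')]


class FakeCollection(object):
    """Stand-in class; a real zhihu_oauth object would be used instead."""

    def __init__(self):
        self._cache = None      # private, filtered out
        self.title = 'example'
        self.answer_count = 0

    def refresh(self):
        pass


print(public_attributes(FakeCollection()))
# prints ['answer_count', 'refresh', 'title']
```

Applied to the listing above, the same filter would leave only the names from 'answer_count' onward.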

Finally, I used this library to download all the images in the answers to a certain question (grabbing pictures of good-looking people, hahahaha), and it took fewer than 30 lines of code (not counting comments). Here it is for you.

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # @Time:
    # @Author: Lyrichu
    # @Email: 919987476@qq.com
    # @File: save_images.py
    '''
    @Description: save the pictures from all answers to a question
    '''
    from __future__ import print_function  # use Python 3's print function
    from zhihu_oauth import ZhihuClient
    import re
    import os
    try:
        from urllib.request import urlretrieve  # Python 3
    except ImportError:
        from urllib import urlretrieve  # Python 2

    client = ZhihuClient()
    # log in
    client.load_token('token.pkl')  # load the token file
    id = 24400664  # https://www.zhihu.com/question/24400664 (what kind of experience is being good-looking)
    question = client.question(id)
    print(u"question:", question.title)
    print(u"number of answers:", question.answer_count)
    # create a folder for storing the images
    path = question.title + u"(image)"
    os.mkdir(path)
    index = 1  # image number
    for answer in question.answers:
        content = answer.content  # answer content
        re_compile = re.compile(r'  ')  # the image-matching pattern is missing from the source text
        img_lists = re.findall(re_compile, content)
        if img_lists:
            for img in img_lists:
                img_url = img[0]  # image url
                urlretrieve(img_url, path + u"/%d.jpg" % index)
                print(u"saved image %d" % index)
                index += 1
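The regular expression in the script above did not survive in this copy of the article, so it cannot be reproduced here. As a substitute technique, and not the author's original method, image URLs can be collected with the standard library's `html.parser` module. The sketch below assumes Python 3 and assumes that `answer.content` is an HTML string containing ordinary `<img src="...">` tags. `ImgSrcCollector` and `image_urls` are hypothetical names, and the sample HTML is made up for illustration.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag, in document order."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == 'img':
            src = dict(attrs).get('src')
            if src:  # skip <img> tags without a usable src
                self.urls.append(src)


def image_urls(html):
    """Return the image URLs found in an HTML string."""
    collector = ImgSrcCollector()
    collector.feed(html)
    collector.close()
    return collector.urls


sample = '<p>text</p><img src="https://example.com/a.jpg"><img alt="no src">'
print(image_urls(sample))
```

Each URL returned by `image_urls(answer.content)` could then be handed to `urlretrieve` in place of `img[0]` in the loop above.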

If you write the crawler yourself, you cannot get all the answers simply by fetching and parsing the web page, so you would have to reverse-engineer the Zhihu API, which is troublesome. Using this ready-made wheel is much easier. From now on, enjoying the good-looking people of Zhihu is no longer a hassle.
