Analysis of GitHub user followers with Python

Source: Internet
Author: User
GitHub user Followers analysis

How to analyze a GitHub user's followers.

Over the weekend I used Python to analyze the users who follow my own GitHub account. The approach, the problems I ran into, and the statistical results are described below.

On GitHub, a user's home page shows the following information, which is what we mainly want to extract:
-User name
-Location
-Number of repositories, stars, followers, and following
-Contributions in the last year
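
As a rough illustration of pulling these fields out of a page, here is a minimal sketch that uses only the standard library's `html.parser`, so it runs without extra packages. The markup in `SAMPLE_HTML`, its class names, and the `data-kind` attribute are all invented for this example; GitHub's real profile HTML is different and changes over time, so the selectors would need to be adapted.

```python
from html.parser import HTMLParser

# Hypothetical, simplified markup. This is NOT GitHub's real profile HTML.
SAMPLE_HTML = """
<div class="profile">
  <span class="username">octocat</span>
  <span class="location">San Francisco</span>
  <a class="counter" data-kind="repositories">8</a>
  <a class="counter" data-kind="stars">3</a>
  <a class="counter" data-kind="followers">120</a>
  <a class="counter" data-kind="following">9</a>
  <h2 class="contributions">42 contributions in the last year</h2>
</div>
"""


class ProfileParser(HTMLParser):
    """Collects the text of the elements we care about into a dict."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # key under which the next text node is stored

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        css_class = attrs.get("class", "")
        if css_class in ("username", "location", "contributions"):
            self._current = css_class
        elif css_class == "counter":
            self._current = attrs.get("data-kind")

    def handle_data(self, data):
        if self._current and data.strip():
            self.fields[self._current] = data.strip()
            self._current = None

    def handle_endtag(self, tag):
        self._current = None


def parse_profile(html):
    """Return the profile fields as a dict, with counters converted to int."""
    parser = ProfileParser()
    parser.feed(html)
    fields = parser.fields
    # "42 contributions in the last year" -> 42
    first_word = fields["contributions"].split()[0]
    fields["contributions"] = int(first_word.replace(",", ""))
    for key in ("repositories", "stars", "followers", "following"):
        fields[key] = int(fields[key])
    return fields


profile = parse_profile(SAMPLE_HTML)
print(profile)
```

The parser keeps a single "current key" and stores the first non-empty text node after a matching start tag, which is enough for flat markup like the sample but not for nested elements.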

To extract this data, the most direct way is to fetch the page with requests and then pull the information out of the HTML with BeautifulSoup.

Some detours

At first I did not plan to get user information this way, because GitHub has a public REST API v3 that can return a specified user's information, and PyGithub wraps it for easy use. But my experiments ran into the following problems, so I gave up on the REST API v3:
1. API request frequency is limited, so multithreading cannot be used to fetch user information in bulk quickly
2. I am not sure whether it is a small bug, but the API does not return a user's contributions for the last year

Tools
-Python 3: a complete farewell to my Python 2
-BeautifulSoup: extracts data from HTML or XML files
-Requests: requests web pages
-Multiprocessing: for more speed
-pyecharts: a stunningly good-looking charting tool

Steps
1. Get all followers of the target user, for example https://github.com/wangshub?page=1&tab=followers
2. Change the page number and traverse all users
3. Extract each user's key information and save it to a CSV file
4. Clean and filter the data
5. Draw charts with pyecharts and count location frequencies

Experimental results
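
Before the results, the first three steps listed above (building the follower-page URLs, walking the pages, and saving rows to CSV) might look roughly like the sketch below. It uses only the standard library and a fake, in-memory fetcher so that it runs offline. The names `followers_page_url`, `iter_follower_pages`, `fetch_page`, and `write_csv` are hypothetical, and stopping at the first empty page is an assumption about how paging ends, not something the article states.

```python
import csv
import io
from urllib.parse import urlencode


def followers_page_url(username, page):
    """Build the followers-tab URL for one page, as in the article's example."""
    query = urlencode({"page": page, "tab": "followers"})
    return f"https://github.com/{username}?{query}"


def iter_follower_pages(username, fetch_page, max_pages=1000):
    """Yield follower logins page by page until a page comes back empty.

    fetch_page is a hypothetical callable (URL -> list of logins). In a real
    crawler it would download the page and parse the HTML.
    """
    for page in range(1, max_pages + 1):
        logins = fetch_page(followers_page_url(username, page))
        if not logins:  # assumption: an empty page means we are past the end
            break
        yield from logins


def write_csv(rows, fieldnames, out):
    """Write a list of dicts to a file-like object as CSV with a header row."""
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Offline demo: a fake fetcher backed by a dict instead of real HTTP requests.
FAKE_PAGES = {
    followers_page_url("wangshub", 1): ["alice", "bob"],
    followers_page_url("wangshub", 2): ["carol"],
}
logins = list(iter_follower_pages("wangshub", lambda url: FAKE_PAGES.get(url, [])))

buffer = io.StringIO()
rows = [{"login": name, "location": ""} for name in logins]
write_csv(rows, ["login", "location"], buffer)
print(buffer.getvalue())
```

Passing the fetcher in as a parameter keeps the paging logic testable without network access; a real fetcher would also need to respect GitHub's rate limits.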

As of 2018-01-15, my GitHub account had 1,214 followers in total. The analysis results are as follows.

User Location Analysis

Excluding users who did not fill in a location, and after converting Chinese place names into pinyin, the word cloud is as follows.
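
The filtering and frequency count behind such a word cloud can be sketched with `collections.Counter`. The sample list is made up, the function name `location_counts` is hypothetical, and lowercasing is an assumed normalization; the pinyin conversion mentioned above needs an extra library and is not shown here.

```python
from collections import Counter


def location_counts(locations):
    """Count non-empty locations after trimming whitespace and lowercasing."""
    cleaned = (loc.strip().lower() for loc in locations if loc and loc.strip())
    return Counter(cleaned)


# Made-up sample; empty strings and None stand for users with no location.
sample = ["Beijing", "beijing ", "", None, "Shanghai", "Shenzhen", "Beijing"]
counts = location_counts(sample)
print(counts.most_common(3))
```

The resulting name-to-frequency mapping is the kind of input a word cloud chart typically expects.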

Users are mostly from Beijing, Shanghai, Shenzhen, and other places.

User Contributions in the Last Year

To tell whether a user is active, you have to look at their contributions.
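
One way to summarize contribution counts for a chart is to group them into ranges. The sketch below is only an illustration: the bin edges in `BINS` and the sample values are hypothetical, and the article does not say how its chart was binned.

```python
from collections import Counter

# Hypothetical bin edges (inclusive on both ends).
BINS = [(0, 0), (1, 50), (51, 200), (201, 1000)]


def bin_label(value):
    """Return the label of the range a contribution count falls into."""
    for low, high in BINS:
        if low <= value <= high:
            return f"{low}" if low == high else f"{low}~{high}"
    return f">{BINS[-1][1]}"


def histogram(contributions):
    """Count how many users fall into each range."""
    return Counter(bin_label(c) for c in contributions)


sample = [0, 3, 49, 50, 51, 180, 1500]  # made-up contribution counts
hist = histogram(sample)
print(hist)
```

The labels and counts can then be passed to whichever charting tool is used.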

As you can see, a large share of users made between 1 and 50 contributions last year; keep it up in the new year. The user with the most contributions in a year is @dragon-yuan, with a full 4,197 contributions in 2017. Enough said, I went and followed them right away.

User Followers Analysis

Wow, so many experts. Don't stop me, I'm off to follow them.

User Repository Count Analysis

Crawling the number of repositories each user has gives the following statistics.

There is an interesting phenomenon here: a small number of people have more than 1,000 repositories. Opening their GitHub home pages shows that most of these are forked projects. The user with the most repositories has 13,100 of them. The account is called @programmerandhacker, and this is how they introduce themselves:

I Follow best programmer and Hacker, 
do your want to hacked by them? ^_^ Best 
programmers and hackers are >...
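
The round totals reported in this article, such as 13,100 repositories, suggest that the counters were read from abbreviated text like `13.1k`. That is an assumption; the source does not say how the numbers were parsed. Under that assumption, a small hypothetical helper for turning counter text into integers could look like this:

```python
from decimal import Decimal


def parse_count(text):
    """Convert counter text such as '1,214' or '13.1k' to an int."""
    cleaned = text.strip().lower().replace(",", "")
    if cleaned.endswith("k"):
        # Decimal avoids floating-point rounding surprises.
        return int(Decimal(cleaned[:-1]) * 1000)
    return int(cleaned)


for raw in ["1,214", "13.1k", "19.6k", "42"]:
    print(raw, "->", parse_count(raw))
```

Values parsed from an abbreviated form are only accurate to the nearest hundred, which matters when ranking users with similar counts.
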
User Stars Analysis

Everyone says that starring is a good habit.

I have to say, GitHub does have a few star maniacs. @chenruibin has given 10,100 stars in total, truly a good habit.

User Following Analysis

Once again it is @programmerandhacker, who follows 19,600 users in total. I seriously suspect it is a bot.

Finally

That's all for now, I have to go write my paper (T_T). The code is here: https://github.com/wangshub/who_is_following
