GitHub User Followers Analysis
How to analyze a GitHub user's followers.
Preface
Over the weekend I used Python to analyze my own GitHub followers. The statistical results are below.
Problem Analysis
On GitHub, a user's home page looks like the one shown below. The main pieces of user information to extract are:
- User name
- Location
- Number of repositories, stars, followers, and following
- Contributions in the last year
We need to extract the data in the red box above. The most direct way is to request the page with Requests and pull the information out of the HTML with BeautifulSoup.
Some Detours
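For reference, the direct Requests plus BeautifulSoup route mentioned above might look roughly like the sketch below. It only covers the parsing half and runs against a made-up HTML snippet: the class names, the `data-tab` attribute, the sample values, and the helpers `parse_count` and `parse_profile` are all assumptions for illustration, not GitHub's real markup or the code from this post.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical, simplified profile markup with made-up values.
# GitHub's real HTML differs and changes over time, so every
# selector used below is an assumption.
SAMPLE_HTML = """
<div class="profile">
  <span class="username">example-user</span>
  <span class="location">Beijing</span>
  <a class="counter" data-tab="repositories">12</a>
  <a class="counter" data-tab="stars">1.5k</a>
  <a class="counter" data-tab="followers">1,024</a>
  <a class="counter" data-tab="following">7</a>
  <h2 class="contributions">327 contributions in the last year</h2>
</div>
"""


def parse_count(text):
    """Turn a counter string such as '42', '1,024' or '1.5k' into an int."""
    text = text.strip().lower().replace(",", "")
    if text.endswith("k"):
        return int(round(float(text[:-1]) * 1000))
    return int(text)


def parse_profile(html):
    """Extract the fields listed above from a profile page's HTML."""
    soup = BeautifulSoup(html, "html.parser")

    def text_of(selector):
        node = soup.select_one(selector)
        return node.get_text(strip=True) if node else ""

    profile = {"name": text_of(".username"), "location": text_of(".location")}
    for tab in ("repositories", "stars", "followers", "following"):
        raw = text_of(f'a.counter[data-tab="{tab}"]')
        profile[tab] = parse_count(raw) if raw else 0

    # The heading reads like "327 contributions in the last year";
    # keep only the leading number.
    match = re.search(r"\d[\d,]*", text_of(".contributions"))
    profile["contributions"] = parse_count(match.group()) if match else 0
    return profile


profile = parse_profile(SAMPLE_HTML)
```

With a live page, the `html` argument would come from something like `requests.get(url).text`, and the selectors would have to be adapted to whatever the real page uses at the time.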
At first I did not plan to get user information this way, because GitHub has a public REST API v3 that returns a given user's information, and PyGithub wraps it for easy use. But while experimenting I ran into the following problems, so I gave up on REST API v3:
1. The API request rate is limited, so multithreading cannot be used to fetch user information in bulk quickly
2. I am not sure whether it is a small bug, but the API does not return a user's contributions for the last year
Tools
- Python 3: a complete farewell to Python 2
- BeautifulSoup: extracts data from HTML or XML files
- Requests: requests web pages
- Multiprocessing: for more speed
- pyecharts: a strikingly good-looking charting tool
Steps
- Get all followers of the target user, e.g. https://github.com/wangshub?page=1&tab=followers, by changing the page number
- Traverse all users, extract each user's key information, and save it to a CSV file
- Clean and filter the data
- Plot with pyecharts and compute location frequency statistics
Experimental Results
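The results below came out of the pipeline described in the steps above. A minimal sketch of the paging and CSV parts follows, using only the standard library. The function names, the `FIELDS` column list, the "stop at the first empty page" rule, and the injected `fetch_page` callable are assumptions for illustration; the actual code is in the repository linked at the end.

```python
import csv
import io
from urllib.parse import urlencode

# Assumed column layout for the CSV file.
FIELDS = ["name", "location", "repositories", "stars",
          "followers", "following", "contributions"]


def followers_page_url(user, page):
    """Build the URL of one page of a user's follower list."""
    query = urlencode({"page": page, "tab": "followers"})
    return f"https://github.com/{user}?{query}"


def crawl_followers(user, fetch_page, max_pages=1000):
    """Walk follower pages until one comes back empty.

    fetch_page(url) is supplied by the caller and returns the list of
    usernames found on that page (for example via Requests and
    BeautifulSoup). Stopping at the first empty page is an assumption.
    """
    names = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(followers_page_url(user, page))
        if not batch:
            break
        names.extend(batch)
    return names


def save_users(rows, fileobj):
    """Write user dicts as CSV; missing fields are left empty."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Offline demonstration with a fake fetcher instead of real requests.
fake_pages = {
    followers_page_url("some-user", 1): ["alice", "bob"],
    followers_page_url("some-user", 2): ["carol"],
}
found = crawl_followers("some-user", lambda url: fake_pages.get(url, []))

buffer = io.StringIO()
save_users([{"name": name} for name in found], buffer)
```

Passing the fetcher in as an argument keeps the paging logic testable without network access. A pool of worker processes could map a per-user fetch function over `found` to speed things up.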
As of 2018-01-15, my GitHub account had 1214 followers in total. The analysis results are as follows.
User Location Analysis
After excluding users who did not fill in a location and converting Chinese to pinyin, the word cloud is as follows
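The frequency counts behind such a word cloud can be sketched with the standard library alone. The helper `location_counts` and the sample locations are hypothetical. The Chinese-to-pinyin step is left out here, since it needs a third-party package (something like pypinyin could do it, though which tool the original used is not stated).

```python
from collections import Counter


def location_counts(locations):
    """Count each non-empty location, ignoring case and surrounding spaces."""
    cleaned = (loc.strip().title() for loc in locations if loc and loc.strip())
    return Counter(cleaned)


# Made-up sample data; empty strings and None stand for users
# who did not fill in a location.
sample = ["Beijing", " beijing ", "Shanghai", "", None, "shenzhen", "Shanghai", "BEIJING"]
counts = location_counts(sample)

# (word, count) pairs, most frequent first, ready for a word-cloud tool.
pairs = counts.most_common()
```

Normalizing with `title()` is a crude choice: it merges `beijing` and `BEIJING`, but it would not merge different spellings of the same city.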
Users are mostly from Beijing, Shanghai, Shenzhen, and other places.
Last Year's User Contribution Analysis
To tell whether a user is active, look at their contributions.
As can be seen, more than half of the users made between 1 and 50 contributions last year. Time to step it up in the new year. The user with the most contributions in a year is @dragon-yuan, with a full 4,197 contributions in 2017. Enough said, I went and followed them.
User Followers Analysis
Whoa, so many big shots. Don't stop me, I'm going to follow a few.
User Repository Count Analysis
After crawling each user's repository count, the statistics are as follows
There is an interesting phenomenon: a small number of people have more than 1,000 repositories. Opening their GitHub home pages shows that most of these are forked projects. The user with the most repositories has 13,100 of them and is called @programmerandhacker. This is how he introduces himself:
I Follow best programmer and Hacker,
do your want to hacked by them? ^_^ Best
programmers and hackers are >...
User Stars Analysis
Everyone says that starring is a good habit.
I have to say, there are some star-crazy people on GitHub. This fellow @chenruibin has given 10,100 stars in total, truly a good habit ~
User Following Analysis
Again it is @programmerandhacker, who follows 19,600 users in total. I seriously suspect this is a bot.
Finally
That's all for now, I have a paper to write T_T. The code is here: https://github.com/wangshub/who_is_following