Data volume: 3,289,329 people.
Data acquisition tool: Distributed Python crawler
Analysis tool: ElasticSearch + Kibana
Analysis angles: geographic location, gender ratio, various rankings, universities, activity level.
Please note:
All of the analysis below is based on the personal information of the roughly 3 million users I crawled. It is not authoritative and is for reference only.
The data was captured in July 2017. User data changes over time, so this report is only a snapshot of that period.
Blue represents male users and red represents female users. The exact figures are:
Male: 1,202,234, accounting for 51.55%.
Female: 1,129,874, accounting for 48.45%.
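As a quick sanity check on the percentages above, here is a minimal sketch that recomputes them. Note that the two counts sum to 2,332,108, which is fewer than the 3,289,329 crawled users, so the shares appear to be computed only over users who stated a gender. The function name `gender_shares` is just an illustrative helper, not part of the original analysis.

```python
def gender_shares(male: int, female: int) -> dict:
    """Return each gender's share, in percent, of users who stated a gender."""
    total = male + female
    return {
        "male": round(100 * male / total, 2),
        "female": round(100 * female / total, 2),
    }


# Counts taken from the report above.
shares = gender_shares(1_202_234, 1_129_874)
print(shares)
```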
Where are Zhihu users located?
Here is where in the country (or the world?) people are using Zhihu:
It can be seen that student users make up the majority; product managers, programmers, operations staff, and HR are also numerous. Let's look at the specific ranking (top 10):
"Students" are clearly the dominant group, so let's remove "students" and look at the ranking of actual professions:
Gender distribution in mainstream occupations:
In the pie chart above, the inner ring shows the share of each of the top 10 occupations, and the outer ring shows the male-to-female split within each occupation (blue for male, red for female). The same data as a bar chart:
Having looked at the gender distribution of each occupation, we next use a heat map to see how the top five occupations are distributed across regions. The darker the color, the more people in that region hold that occupation:
For ease of display I removed product managers here; you just need to know that product managers are the largest group everywhere... I don't understand why there are so many product managers. Maybe it helps them promote their products?
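For readers who want to reproduce this kind of occupation ranking, the following is a rough sketch of an Elasticsearch request body that ranks occupations, drops "students", and splits each occupation by gender. It only builds the JSON body; it does not talk to a cluster. The field names `occupation` and `gender` and the excluded value `"student"` are assumptions for illustration, since the real index mapping and stored values are not shown in this report.

```python
import json


def occupation_ranking_query(top_n=10, excluded=("student",)):
    """Build a search body for a top-N occupation ranking with a gender split.

    Field names and the excluded value are hypothetical placeholders.
    """
    return {
        "size": 0,  # only aggregation results are needed, no individual hits
        "aggs": {
            "top_occupations": {
                "terms": {
                    "field": "occupation",      # assumed keyword field
                    "size": top_n,
                    "exclude": list(excluded),  # exact values to leave out
                },
                "aggs": {
                    # Sub-aggregation: gender counts inside each occupation
                    "by_gender": {"terms": {"field": "gender"}},
                },
            }
        },
    }


body = occupation_ranking_query()
print(json.dumps(body, indent=2))
```

The nested `by_gender` bucket corresponds to the outer ring of the pie chart: one gender split per occupation bucket.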
Next, here is the detailed breakdown:
The results above are not necessarily accurate, since a large share of students may not have filled in their school. From the data available, the universities with the most active users are, in descending order: Zhejiang University, Wuhan University, Huazhong University of Science and Technology, Sun Yat-sen University, Peking University, Shanghai Jiao Tong University, Fudan University, Nanjing University, Sichuan University, and Tsinghua University.
Since we are on the subject of schools, let's look at the male-to-female ratio at each university, hehe.
An interesting finding: at most universities, the Zhihu users are mostly male...
Next, let's see which universities receive the most likes:
In Beijing, the universities with the most Zhihu users are, in order: Peking University, Beijing University of Posts and Telecommunications, Communication University of China, Renmin University of China, and Tsinghua University.
In Shanghai, the universities with the most Zhihu users are: Shanghai Jiao Tong University, Fudan University, Tongji University, Shanghai University, and Shanghai University of Finance and Economics.
In Hangzhou, the universities with the most Zhihu users are: Zhejiang University, Zhejiang Polytechnic University, Hangzhou University of Technology, Zhejiang University (Computer Science), and Zhejiang University (Software Engineering). Zhejiang University really is a heavy user...
In descending order, the ranking is: Wuhan University, Zhejiang University, Sun Yat-sen University, South China University of Technology, Peking University, Huazhong University of Science and Technology, Shanghai Jiao Northwest.
Well, that wraps up the university analysis. Let's take a look at the various user rankings.
Top 100 big Vs (influential verified users) by number of likes received
The larger a name appears in the word cloud, the more likes that user has received:
Let's look at the same data as a bar chart:
Shu takes first place without dispute, with 360+, which is terrifying. Next come pawn, tang deficiency, vczh, fat fat cat, Zhu Yu, Seasee youl, Ze ran, ghost wood knowledge, and beans. The top five by total likes include writers (Shu and Tang), so it seems writers are still very popular when answering questions on Zhihu, and the ability to express oneself well is an important factor in getting one's views recognized.
Top 100 big Vs by number of followers
The larger a name appears in the word cloud, the more followers that user has. See whether you recognize any of these big Vs:
Again, let's look at the same data as a bar chart:
The specific rankings are:
The 10 big Vs who have answered the most questions are, in descending order: vczh, Li Dong, Zhao Gang, another sock, a universal, M3 small mushrooms, Kun yu, White cat turn wind, Yskin, and anus pulled out a chainsaw. Work at Microsoft seems to be very busy, yet Brother Wheel (vczh) is on Zhihu all day...
Let's add the number of likes these users have received and see whether there is any relationship between the number of questions answered and the number of likes received:
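Beyond comparing the two series by eye on a chart, one way to quantify such a relationship is the Pearson correlation coefficient. The sketch below is not part of the original analysis, and the sample numbers are made-up placeholders rather than values from the crawl; the function name `pearson` is likewise just an illustrative helper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Placeholder values only, NOT taken from the crawled data.
answers_written = [120, 450, 900, 2300, 5100]
likes_received = [8000, 15000, 12000, 60000, 90000]
print(pearson(answers_written, likes_received))
```

A value near 1 would suggest that users who answer more also tend to collect more likes; a value near 0 would suggest little linear relationship.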
Let's see how many Zhihu Live sessions they have participated in:
The most active big V has, unexpectedly, participated in 1600+ Live sessions. That takes real energy and money, haha.
A senior programmer with a monthly salary of 30k crawls millions of users with Python and analyzes the data!