Social Network Analysis

Source: Internet
Author: User


One: What is SNA (social network analysis)?

What is the power of social network analysis? I'd like to explain it through a few cases.

Case 1: SNA helps you devote your energy to the right people. It can be used to understand an organizational structure, or to work out whom to lobby for support.

Case 2: A real example. When Tencent (nicknamed the "Goose Factory") first launched its friend circle product, it made a very good impression on me, because it recommended friends I had not been in contact with for many years, living up to the name "Friends". Renren's friend recommendations are also very accurate. Behind these products is SNA: a friend's friend is also my friend, an enemy's friend is my enemy, an enemy's enemy is my friend, and a friend's enemy is my enemy.

These two cases give an intuitive impression of social network analysis, where the nodes in the network are people. But if SNA were used only on people, it would be too narrow. The same idea can be applied entirely to objects. For example:

Case 3: Douban FM is also a product I like very much. Its promise: encounter the music you like. Some songs carry my memories of a certain period; some of those memories are still fresh, others have gradually blurred. Every now and then it is a pleasant surprise to run into these clear or vague memories on Douban FM. How does Douban FM do that? Does it classify songs by rhythm, melody, style, or lyrics? It would be foolish and naive to think so. This is SNA at work: each song is a node in the network, and when you like a song or mark it as "never play again", the connections between the songs you have heard are strengthened or weakened.

Through the above three cases, you can have a preliminary understanding of the SNA.
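The idea in Case 3 can be sketched in a few lines. This is only a toy illustration of "songs as nodes, feedback as edge weights", not how Douban FM actually works; the function names `update_link` and `recommend`, the step size, and the song names are all made up for the example.

```python
import networkx as nx

def update_link(graph, song_a, song_b, liked, step=1.0):
    """Strengthen the link between two songs on a like, weaken it otherwise."""
    current = graph.get_edge_data(song_a, song_b, default={}).get("weight", 0.0)
    new_weight = current + step if liked else max(current - step, 0.0)
    graph.add_edge(song_a, song_b, weight=new_weight)
    return new_weight

def recommend(graph, song):
    """Return the neighbor of `song` with the strongest connection."""
    return max(graph[song], key=lambda other: graph[song][other]["weight"])

songs = nx.Graph()
update_link(songs, "song A", "song B", liked=True)   # heard together, liked
update_link(songs, "song A", "song B", liked=True)   # liked again
update_link(songs, "song A", "song C", liked=False)  # heard together, skipped

print(recommend(songs, "song A"))
```

After two likes the A-B edge outweighs the A-C edge, so "song B" is what a listener of "song A" would be offered next.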

Two: My Circle of Friends

There are generally two ways to obtain a friend circle: 1) social apps and social networking sites such as Renren and Weibo; 2) communication records: phone calls, mail, and SMS. The latter data is held by the corresponding operators; the former can be obtained through the application's open API.

I used a crawler to get my Renren friend circle. I grabbed two layers: my friends, and my friends' friends. In fact, the number of layers can be set freely, and it is easy to implement with a recursive function. The only cost is running time and storage. Even for a two-layer friend circle, the crawl ran for a minute on my laptop.
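A minimal sketch of such a depth-limited recursive crawl is shown below. The original crawler is not given, so everything here is an assumption: `fetch_friends` is a hypothetical stand-in for the real request to the site, and `TOY_FRIENDS` replaces live data so the example can run offline.

```python
import networkx as nx

# Toy data standing in for the real site (hypothetical).
TOY_FRIENDS = {
    "me": ["alice", "bob"],
    "alice": ["me", "carol"],
    "bob": ["me", "dave"],
    "carol": ["alice", "erin"],
    "dave": ["bob"],
    "erin": ["carol"],
}

def fetch_friends(user):
    """Return the friend list of `user` (toy lookup instead of an HTTP call)."""
    return TOY_FRIENDS.get(user, [])

def crawl(graph, user, depth, expanded=None):
    """Recursively add friendships to `graph`, going `depth` layers out."""
    if expanded is None:
        expanded = {}
    # Stop at the depth limit, or if this user was already expanded
    # with at least as many layers remaining.
    if depth == 0 or expanded.get(user, 0) >= depth:
        return graph
    expanded[user] = depth
    for friend in fetch_friends(user):
        graph.add_edge(user, friend)
        crawl(graph, friend, depth - 1, expanded)
    return graph

friend_graph = crawl(nx.Graph(), "me", depth=2)
print(sorted(friend_graph.nodes()))
```

With `depth=2` the graph contains me, my friends, and their friends; "erin", three steps away, is never reached. Raising `depth` is the only change needed to crawl more layers, at the cost of time and storage.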

Then networkx took the stage. After some processing, I obtained the following results:

1) Two-layer friend circle

This is a graph of 7,169 friends. Of course the most central point is me. You can see that the outer ring consists of relatively isolated "clouds"; that is because I only crawled two layers.

More than 7,000 friends and so many layers of relationships: how do you analyze them? Don't worry, social network analysis is not a new field. According to the American literature it dates back to the 1960s and 1970s, although it has only become hot in the last ten years or so. So there are plenty of ready-made algorithms that cover most of your needs.

The basic analysis for more than 7,000 friends is as follows:

--------------- 2014-06-08 Overall analysis ---------------
The social network has a total of 7169 friends.
Top ten by number of friends:
1-- Xu Xiwen --909
2-- Liu Shang --607
3-- Li Chao --505
4-- colipso --405
5-- Lu Xiufang --343
6-- Xin Ting --336
7-- Wang --312
8-- Wang Hui --258
9-- Sun Hao --255
10-- Yang Zixu --248

-------- 2014-06-08 Popularity index analysis (based on closeness centrality) --------
Top ten friends by popularity index:
1-- colipso --0.51
2-- Dixon --0.50
3-- Xu Xiwen --0.40
4-- Carina --0.40
5-- Lofeng --0.39
6-- Zhang Wei --0.39
7-- Chen Xin --0.39
8-- King Yun Jie --0.39
9-- Sun Feng --0.39
10-- Zhang Ning --0.38

--------- 2014-06-08 Hub index analysis (based on the betweenness centrality algorithm) ---------
Top ten friends by hub index:
1-- Xu Xiwen --0.21
2-- colipso --0.20
3-- Liu Shang --0.14
4-- Dixon --0.12
5-- Li Chao --0.11
6-- Lu Xiufang --0.08
7-- Xin Ting --0.08
8-- Wang --0.08
9-- Wang Hui --0.06
10-- Chen Xin --0.05

---------- 2014-06-08 Behind-the-scenes index analysis (based on the eigenvector centrality algorithm) ----------
Not defined for multigraphs.

------- 2014-06-08 Google PageRank index analysis (based on the Google PageRank algorithm) -------
pagerank() not defined for graphs with multiedges.

Some of the terms explained:

Hub: if a person belongs to two otherwise unrelated groups at the same time, that person occupies a hub position.

Behind the scenes: as the name implies, a person who is not connected to most people, only to the key people, and who influences the group through those key people.

The last two algorithms could not be applied in this particular analysis because, as the error messages show, the network built from the underlying data is a multigraph (it can hold more than one edge between the same pair of people), which these algorithms do not support.
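The four rankings above can be sketched with networkx as follows. The data is a made-up toy network, and collapsing the multigraph into a simple `nx.Graph` before running eigenvector centrality is one possible workaround that I am assuming here; it is not something the original analysis did.

```python
import networkx as nx

# Toy friendships. The last pair repeats ("me", "ann") in reverse, the way a
# crawl might record the same friendship once from each side.
edges = [("me", "ann"), ("me", "bo"), ("me", "cy"),
         ("ann", "bo"), ("cy", "dee"), ("ann", "me")]
multi = nx.MultiGraph(edges)

def top(scores, n=3):
    """Return the n highest-scoring (name, score) pairs."""
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

# Eigenvector centrality refuses multigraphs, as in the log above.
try:
    nx.eigenvector_centrality(multi)
except nx.NetworkXException as err:
    print("multigraph rejected:", err)

simple = nx.Graph(multi)  # parallel edges collapse into a single edge

degree = dict(simple.degree())                      # number of friends
closeness = nx.closeness_centrality(simple)         # popularity index
betweenness = nx.betweenness_centrality(simple)     # hub index
eigen = nx.eigenvector_centrality(simple, max_iter=1000)  # behind the scenes

for name, scores in [("friends", degree), ("popularity", closeness),
                     ("hub", betweenness), ("behind the scenes", eigen)]:
    print(name, top(scores))
```

`nx.pagerank(simple)` can be called in the same way once the graph is simple; it is left out of the sketch only because recent networkx versions need SciPy for it.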

2) Core Communication Circle

It is basically impossible to know all of these people. networkx also provides an algorithm for analyzing someone's core communication circle. Taking me as the example again:

--------------- 2014-06-08 Overall analysis ---------------

The social network has a total of 502 friends.

The other analyses use the same module and work the same way as above, so they are not repeated here.
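The text does not say which networkx function was used for the core circle. One plausible candidate, assumed here, is `nx.ego_graph`, which extracts a node together with its neighbors within a given radius and all edges among them. A small sketch on toy data:

```python
import networkx as nx

# Toy network: "cy" and "dee" are outside my direct circle.
G = nx.Graph([("me", "ann"), ("me", "bo"), ("ann", "bo"),
              ("bo", "cy"), ("cy", "dee")])

# radius=1 keeps only me and my direct friends, plus the ties among them.
core = nx.ego_graph(G, "me", radius=1)

print(sorted(core.nodes()))
print(core.number_of_edges())
```

The resulting subgraph can then be passed to the same centrality functions as the full network.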

3) Circles within the circle

The above are only macro-level results. At the micro level, a large group always contains small circles. People inside such a circle are closer to each other and share common topics; they generally exclude outsiders to some degree and trust each other highly. This is the so-called circle within the circle.

For a discipline that has been developing for nearly half a century, the old saying holds: whatever you think of, someone has long since thought of it.

For example, in my friend circle:

The first clique is:

Chi Wenying Shing Zheng Sun Hao Chen Xin Zhang Chenxing Lubevin

This is a group of my high school classmates.
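Small circles like this correspond to what graph theory calls cliques: groups in which everyone knows everyone else. The text does not name the function used, so the following is a sketch under the assumption that something like `nx.find_cliques` (which enumerates maximal cliques) does the job; the names are invented.

```python
from itertools import combinations

import networkx as nx

G = nx.Graph()
# Four classmates who all know each other.
G.add_edges_from(combinations(["me", "c1", "c2", "c3"], 2))
# A smaller circle of colleagues, plus one stray acquaintance.
G.add_edges_from([("me", "w1"), ("me", "w2"), ("w1", "w2"), ("c1", "x")])

# Maximal cliques, largest first.
cliques = sorted(nx.find_cliques(G), key=len, reverse=True)
largest = sorted(cliques[0])

print(largest)
```

Sorting by size puts the tightest large group first, which is how a "first clique" like the one above could be picked out.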

4) Shortest Path

There are already very mature algorithms for finding the shortest path between two nodes in a social network. This is the idea behind the so-called six degrees of separation. In other words: if I want to get to know xxx, which chain of the fewest intermediaries gets me there?

Extrapolating: take a network made up of books, where each book is a node, and two books are connected if one person has read both. The question is, on a reading app, how do you recommend a third book to someone who has read two books? Answer: recommend the other books on the shortest path between those two books. Some will ask: aren't the two books already directly connected, so isn't that direct path the shortest? This is where path weights come in. With weights, the direct connection is not necessarily the shortest. How are the weights obtained? Well, it depends.

Since I only crawled two layers of friends, the shortest path from me to anyone will not exceed 2.

A random example: colipso --- rain --- fan. If I want to get to know fan, I should go through rain.
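The book example above can be made concrete. In this sketch the weight is an assumption of mine: the "distance" between two books is 1 divided by the number of readers they share, so books with many common readers are close. The titles and reader counts are invented.

```python
import networkx as nx

# (book, book, number of shared readers) - invented numbers.
shared_readers = [
    ("Book A", "Book B", 1),   # our reader is the only one who read both
    ("Book A", "Book C", 4),
    ("Book C", "Book B", 5),
    ("Book B", "Book D", 2),
]

books = nx.Graph()
for a, b, readers in shared_readers:
    # Assumed weighting: more shared readers means a shorter distance.
    books.add_edge(a, b, distance=1.0 / readers)

unweighted = nx.shortest_path(books, "Book A", "Book B")
weighted = nx.shortest_path(books, "Book A", "Book B", weight="distance")
recommendations = weighted[1:-1]  # books strictly between the two

print(unweighted)
print(weighted)
print(recommendations)
```

Without weights the direct edge wins. With weights, the detour through "Book C" costs 0.25 + 0.2 = 0.45, less than the direct distance of 1.0, so "Book C" becomes the recommendation.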

5) Groups of three (triads)

For any three people, you can have the following five relationships:

For example, in the 021C type, can the person in the middle introduce the other two to each other?

Here are the counts of each triad type in my circle of contacts:

Triads of type 201: 94109
Triads of type 021C: 0
Triads of type 021D: 0
Triads of type 210: 0
Triads of type 120U: 0
Triads of type 030C: 0
Triads of type 003: 19747819
Triads of type 300: 3605
Triads of type 012: 0
Triads of type 021U: 0
Triads of type 120D: 0
Triads of type 102: 1112967
Triads of type 111U: 0
Triads of type 030T: 0
Triads of type 120C: 0
Triads of type 111D: 0

Of course, because I only crawled two layers of the friend circle, which can be regarded as the core communication circle, many triad types do not appear. If more layers were crawled, the results would be more significant.

Extrapolating again: whether the nodes in the network are people or things, a specific strategy can be formulated for each of these structures to achieve a specific goal. The analysis above completes the first step of any action: identifying the target.
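A count like the one above can be produced with `nx.triadic_census`, which expects a directed graph. The sketch below is an assumption about the setup, not the original script: it converts a toy undirected friendship graph with `to_directed()`, so every tie becomes mutual. Under that assumption only the types 003, 102, 201 and 300 can be non-zero, which may be a second reason, besides crawl depth, for the many zeros.

```python
from math import comb

import networkx as nx

friends = nx.Graph([("me", "ann"), ("me", "bo"), ("ann", "bo"),
                    ("me", "cy"), ("cy", "dee")])

# Every undirected friendship becomes a pair of opposite directed edges.
census = nx.triadic_census(friends.to_directed())
nonzero = {kind: count for kind, count in census.items() if count}

print(nonzero)
# Every group of three nodes is counted exactly once.
print(sum(census.values()) == comb(friends.number_of_nodes(), 3))
# The four non-zero counts reported above add up to all triples of 502 nodes.
print(94109 + 19747819 + 3605 + 1112967 == comb(502, 3))
```

The last line is a consistency check on the reported figures: they sum to 20,958,500, the number of ways to choose 3 people out of the 502 in the core circle.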

Three: Some scattered thoughts

1) Traditional statistics and modern analysis

While recently studying R and social network analysis, I noticed some differences between traditional methods of statistical analysis and modern methods of analysis.

Traditional statistical analysis dates from an earlier century. Point estimation, interval estimation, and hypothesis testing all depend on some distributional assumption, not to mention Bayesian statistics. A large body of academic work has developed methods for inferring the whole population from small samples, with the aim of reducing the amount of computation. But the problem is that today's environment and user preferences change very quickly, which means the distribution changes quickly. Traditional statistical methods still have limitations when the population and its parameters keep changing.

Modern analysis methods, whether Monte Carlo simulation or social network analysis, are built on intensive computation. Never mind what the distribution is: if one round of simulation is not enough, simulate 10,000 times, or 100,000 times. By the law of large numbers, the result will come out, sort of.
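A tiny example of the "just simulate more" attitude: estimating the average of the larger of two dice rolls without deriving its distribution. The problem is my own choice for illustration; it is convenient because the exact answer, 161/36, is known and the simulation can be checked against it.

```python
import random

def simulate_mean_of_max(trials, seed=0):
    """Estimate E[max of two dice] by brute-force simulation."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    total = 0
    for _ in range(trials):
        total += max(rng.randint(1, 6), rng.randint(1, 6))
    return total / trials

exact = 161 / 36  # about 4.472, from enumerating all 36 outcomes

for trials in (100, 10_000, 100_000):
    estimate = simulate_mean_of_max(trials)
    print(trials, round(estimate, 3), "error:", round(abs(estimate - exact), 3))
```

As the number of trials grows, the estimate tends to settle near the exact value, which is the law of large numbers doing the work that a distributional assumption would do in the traditional approach.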

2) Tools

All of the above analysis was done with Python and the networkx module. With its flexible data structures and its many open source modules (numpy, scipy, matplotlib, networkx, webpy, etc.), Python can be called an essential remedy for home, travel, and data analysis. Its clean syntax also avoids the brace wars, which I appreciate.

The bottleneck for the scale of a networkx analysis is first memory and storage, and second how well the algorithm is chosen. Fewer than ten million nodes is easy to handle. If the number of nodes runs to tens of millions or even billions, the design needs careful thought.

3) Analytical value

Analysis can create value in two ways: in decision making and in products. Whether a decision was right or wrong only shows in the medium to long term. Products are more direct: the value of the analysis quickly shows up in user numbers and user feedback.
