Uncovering How LinkedIn's Data Scientists Work

Source: Internet
Author: User
Keywords: data scientists, jobs, secrets, social networks
Abstract: Among Internet companies, LinkedIn is known as a "slow company," yet it is also the most successful social network, leading the industry in user quality and advertising value. The secret is LinkedIn's efficient team of data scientists. Below, LinkedIn's chief data scientist, Manu Sharma, talks about how LinkedIn works with data.

As a social network, LinkedIn is neither the biggest nor the fastest growing. Founded in 2003, it took 500 days to reach 1 million users. However, as the world's largest professional social network, LinkedIn has real staying power. Today it adds 1 million users every 6 days, an average of about two new users per second. Each year, users run 4.2 billion searches on LinkedIn. LinkedIn's data analysis team analyzes 200 TB of data every day to better understand its users.

Why are people so concerned with statistics and data, and why has data scientist become the sexiest job? At the recent TiE summit, LinkedIn's chief data scientist, Manu Sharma, gave an interview that sheds light on LinkedIn's data analysis work. The interview follows:

Q: Can you tell me about LinkedIn's data science?

A: LinkedIn is a professional social network for its users. In this network, if people look for you but cannot find you, you may miss an opportunity. It is therefore important for users to keep their status current and update their information regularly. LinkedIn's business is built on analyzing this data. To process data in real time, we developed our own algorithm, called Metropolis. It can process 1 billion pieces of data in real time every day. It uses functions from the open source solutions Voldemort, Kafka, and Zoie.

Data scientists need curiosity and intuition. The questions they need to ask themselves are: What can I do with this data? What kind of questions should I ask? What does this data tell me? They also need enough intuition to understand the limitations of the methods they employ. A data scientist's work includes collecting data, cleaning and organizing it, building the right models, and testing those models, and it requires a certain level of programming ability. These are the skills a data scientist needs, and they are also the skills a start-up should look for when building a data science team.

Q: What are the key applications of data at LinkedIn?

A: There are three major data applications in LinkedIn:

1. Developing innovative data products

2. Finding trends and opportunities from internal data

3. Promoting business growth

For example, an "inference algorithm" infers information that users have not stated from the data they have provided. This is especially important for future product design. LinkedIn used such an inference algorithm to launch its "People You May Know" feature, which helps increase user stickiness and word-of-mouth growth. LinkedIn was the first to launch this feature, and it has since become a standard part of social networks.
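The interview does not say how the inference behind this feature works. As a minimal sketch of the general idea, and assuming the simplest possible signal (counting mutual connections), a suggestion list could be computed as follows. The function name `suggest_connections` and the toy network are hypothetical and are not LinkedIn's implementation.

```python
from collections import Counter


def suggest_connections(graph, user, top_n=3):
    """Rank people `user` is not yet connected to by number of mutual connections."""
    direct = graph.get(user, set())
    mutual_counts = Counter()
    for friend in direct:
        for candidate in graph.get(friend, set()):
            # Skip the user themselves and people they already know.
            if candidate != user and candidate not in direct:
                mutual_counts[candidate] += 1
    # Most mutual connections first; ties broken by name for a stable order.
    ranked = sorted(mutual_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


# Hypothetical toy network: each person maps to the set of their connections.
graph = {
    "ana": {"bo", "cy"},
    "bo": {"ana", "dee", "eli"},
    "cy": {"ana", "dee"},
    "dee": {"bo", "cy"},
    "eli": {"bo"},
}

suggestions = suggest_connections(graph, "ana")
print(suggestions)  # ['dee', 'eli']
```

Here "dee" ranks first because she shares two connections with "ana", while "eli" shares only one. A production system would presumably combine many more signals, such as shared employers or schools.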

In addition, using text extraction and text analysis, we build a dictionary of key skills from the way users describe their skills in their profiles. Applying clustering algorithms to this data yields many interesting discoveries that help us improve our services or launch new products.
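To make the skill dictionary idea concrete, here is a small sketch under stated assumptions: skills are taken to be phrases separated by commas, semicolons, line breaks, or the word "and", and a phrase enters the dictionary once it appears in at least `min_profiles` profiles. The helper names, the threshold, and the sample profiles are all hypothetical. The clustering step is not shown.

```python
import re
from collections import Counter


def extract_skills(description):
    """Split a free-text skills description into normalized phrases."""
    skills = set()
    for part in re.split(r"[,;\n]| and ", description.lower()):
        phrase = re.sub(r"\s+", " ", part).strip(" .")
        if phrase:
            skills.add(phrase)
    return skills


def build_skill_dictionary(descriptions, min_profiles=2):
    """Keep phrases that appear in at least `min_profiles` descriptions."""
    counts = Counter()
    for text in descriptions:
        # extract_skills returns a set, so each profile counts a skill once.
        counts.update(extract_skills(text))
    return {skill: n for skill, n in counts.items() if n >= min_profiles}


# Hypothetical profile snippets.
profiles = [
    "Python, machine learning and statistics.",
    "Machine Learning; SQL; Python",
    "Project management, SQL",
]

skills = build_skill_dictionary(profiles)
print(sorted(skills.items()))
# [('machine learning', 2), ('python', 2), ('sql', 2)]
```

Counting each skill once per profile keeps a single verbose profile from dominating the dictionary.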

Furthermore, by analyzing user data across industries, we can make predictions about a particular industry or about the economy as a whole, for example job cuts in one industry or increased hiring plans in certain sectors. One advantage is that this data does not come from questionnaires; it reflects users' real behavior. For that reason, the data is used in the U.S. president's economic policy report. This data is also important for companies' own development.

Q: What are the best-practice principles for data analysis?

A:

1. The greater the amount of data, the better

2. Raw data is better than processed data

3. Data standards and data quality are very important

4. Simple models are better than complex models

5. Modeling is a process of constant trial and error

(Responsible editor: Lu Guang)
