Data scientists are not required

Source: Internet
Author: User



The media describe data scientist as the sexiest career of the 21st century and data scientists as the superstars of the IT industry who tame big data. In fact, companies without data scientists can also make use of big data.

Until now, whenever big data is mentioned, the most discussed topics have been the shortage of talent and the claim that data scientist has become the sexiest career of the 21st century. Reports from Harvard Business Review and consulting firms say the same, and data scientists themselves are naturally happy to be labeled this way.

However, this argument also discourages many companies that are considering a big data strategy in response to big data opportunities. So, without data scientists, is a company really unable to work with big data? Not necessarily.

This is not to say, of course, that data scientists are unimportant. On the contrary, the role is very important in the era of big data. Within a business, the data scientist links the company's IT with the expertise of the industry in which it operates. People at this intersection of knowledge are inherently scarce, and even as big data continues to grow rapidly, only a handful of people will have both kinds of knowledge. However, we would not say that the growth of the early computer industry was constrained because there were too few people like Jobs and Gates. Today the clerk at a neighborhood photo shop can retouch portraits in Photoshop, and nobody requires him to be able to write image processing software.

The same holds in the big data age. Large companies such as Google, Twitter and Facebook can afford sophisticated data scientists, but small businesses can find their own ways to make better use of data. The following takes an e-commerce company as an example and walks through the work of a data scientist to show how data work can be carried out with an organization's existing capabilities.

The work of data scientists can be broadly divided into three areas:

First, building the data architecture; second, building data models; third, data analysis.

Let's take a look at how companies that cannot afford or cannot find data scientists can get started in each area:

Building the data architecture

First, identify the data the business needs. For most e-commerce businesses, any business manager can tell you that the data they need is user behavior data: purchase behavior, reactions to promotions or advertisements, social information, and so on. Each of these types of information can be categorized relatively easily.

The key here is to limit the range of data you need, so that you can set up simple data entry templates that streamline data acquisition and collation. A number of open source tools, such as Hadoop, HBase, Hive and Pig, can help integrate the various types of data. The 80/20 principle generally applies: 80% of operational support needs can be met by 20% of the data. Collaboration between a company's IT staff and its business experts, with help from a few external consultants, should be enough to build an architecture that works.
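One way to apply the 80/20 principle in practice is to count which data sources business requests actually rely on and keep only the smallest set that covers most of them. The following is a minimal sketch in Python under stated assumptions: the function name `smallest_covering_set`, the source names and the request counts are all hypothetical and serve only as illustration.

```python
from collections import Counter


def smallest_covering_set(request_log, target=0.8):
    """Pick the most-requested data sources until they cover `target` of all requests."""
    counts = Counter(request_log)
    total = sum(counts.values())
    if total == 0:
        return [], 0.0
    covered = 0
    selected = []
    # Walk the sources from most to least requested.
    for source, n in counts.most_common():
        if covered / total >= target:
            break
        selected.append(source)
        covered += n
    return selected, covered / total


# Hypothetical log: one entry per business request, naming the data source it needed.
request_log = (
    ["purchases"] * 50
    + ["promo_responses"] * 30
    + ["social_profiles"] * 8
    + ["returns"] * 5
    + ["support_tickets"] * 4
    + ["site_search"] * 3
)

selected, share = smallest_covering_set(request_log)
print(selected, round(share, 2))
```

With this made-up log, two of the six sources already cover 80% of the requests, so the data entry templates could start with those two.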

Building the data model

Another part of the data scientist's work is building data models, which may be descriptive or predictive. This is also the part of the work that is most often mythologized. Recommendation systems and user personalization systems are typical examples. Much of the work consists of a cycle: extract "features" from the data, select an appropriate model, feed the features into it, wait for the model to output results, validate them, and adjust the features. This work requires, first, familiarity with statistical and machine learning models and, second and more importantly, knowledge of the industry. In a recommendation system, for example, the most important step is extracting the features of users and the features of goods. If the modelers do not know the industry, the model will be large and complex and may still be inaccurate. Industry experts, though not necessarily proficient at modeling, have a feel for their market that is often the key to choosing the right features.

Therefore, an e-commerce company can recruit a few employees trained in statistics (or outsource the work) and pair them with its in-house experts to build basic models that suit its needs. These may not be as accurate as those at Google or Facebook, but for most businesses they are good enough. This is also a workable answer when no suitable data scientist, meaning someone proficient in both the industry and modeling, can be found.
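To make the idea of a "good enough" basic model concrete, here is a minimal sketch of a recommender that uses a single feature, namely which goods are bought together. It is not the method of any particular company; the choice of co-purchase counts as the feature, the function names and the sample orders are all assumptions made for illustration. In practice the industry expert's contribution would be deciding which such signals matter.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence(orders):
    """Count how often each pair of items appears in the same order."""
    co = defaultdict(lambda: defaultdict(int))
    for basket in orders:
        for a, b in combinations(sorted(set(basket)), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co


def recommend(co, purchased, top_n=2):
    """Rank items not yet purchased by how often they were bought with purchased ones."""
    scores = defaultdict(int)
    for item in purchased:
        for other, n in co[item].items():
            if other not in purchased:
                scores[other] += n
    # Highest score first; ties are broken alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical order history.
orders = [
    ["diapers", "baby_wipes", "formula"],
    ["diapers", "baby_wipes"],
    ["diapers", "formula"],
    ["coffee", "filters"],
    ["coffee", "baby_wipes"],
]

co = build_cooccurrence(orders)
print(recommend(co, {"diapers"}))
```

A model this small is easy for business staff to inspect, which makes the validate-and-adjust cycle described above practical without a specialist.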

Data analysis

The essence of data analysis is to turn "data" into "information" and discover something valuable to business operations. This is the same method of observation, induction, correlation, analysis and validation used in any scientific or engineering discipline. From this perspective, industry expertise is even more important in data analysis.

Even if you give a data scientist the data from the Large Hadron Collider in Europe, he cannot find the "God particle" without knowing the physics.

Many people like to cite the example of Target, the US retailer that used data analysis to send baby product promotions to a pregnant teenage girl, and many data analysts or data scientists mislead, intentionally or not, when they tell such stories. In fact, without professional knowledge of users and products, data analysis or data models alone could hardly achieve such a result. Any machine-generated model, if it is to be practical, must be manually adjusted to some degree based on feedback.

There are many tools for data analysis, but most of them are still quite complicated and require specialists such as data scientists or data analysts. Because most companies are not yet managed at a fine-grained level, data analysts and BI analysts are already scarce, let alone people who combine industry-specific knowledge with the ability to use data analysis tools. One solution is to turn commonly used analyses into templates wherever possible and to simplify data processing wherever possible, using simple, widely available analysis tools. After all, the purpose of data analysis is to serve the business, and simple tools have advantages in use, sharing and communication. Such a solution is not perfect, but letting experts with rich industry experience compensate for the limits of the tools is one way for a business without a data scientist to benefit from data analysis.
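A templated analysis can be as simple as one reusable function that business staff call with different parameters. The sketch below is only an illustration: the function name `summarize`, the field names and the sales figures are hypothetical.

```python
from collections import defaultdict


def summarize(rows, group_by, metric):
    """Template: count, total and average of `metric` for each value of `group_by`."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        key = row[group_by]
        totals[key] += row[metric]
        counts[key] += 1
    return {
        key: {
            "count": counts[key],
            "total": totals[key],
            "average": totals[key] / counts[key],
        }
        for key in sorted(totals)
    }


# Hypothetical sales records.
sales = [
    {"channel": "email", "region": "north", "revenue": 120.0},
    {"channel": "email", "region": "south", "revenue": 80.0},
    {"channel": "search", "region": "north", "revenue": 200.0},
]

# The same template answers two different business questions.
by_channel = summarize(sales, group_by="channel", metric="revenue")
by_region = summarize(sales, group_by="region", metric="revenue")
print(by_channel)
print(by_region)
```

The person asking the question only chooses the grouping and the metric, so the industry knowledge goes into what to ask rather than how to compute it.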

In the era of big data, the importance of data scientists is beyond doubt. But consider Web content management systems: large websites can hire top engineers to build their own, while small businesses can use a system like WordPress to meet their needs.

In this era of big data, where talent is scarce, companies can use existing tools and their own industry expertise to adopt strategies that let them, too, benefit from data and data analytics.

The good news for those hoping to strike gold with big data is that start-ups like ClearStory are working to make big data easy to use and visualize, so that businesses that cannot afford a high-level data scientist, and staff outside the IT department, also have access to it. It is like the arrival of the Windows era, when ordinary users no longer needed to memorize complicated DOS commands one by one to operate a computer.
