Big data has become the focus of attention: many companies are scrambling to put their own data to use, hoping it will support important decisions. Yet while the big data hype rages on, 92% of companies remain on the sidelines, planning to start at "the right time" or saying they have no intention of touching big data projects at all. Among companies that have already taken on big data projects, most fail, and they often fall into the same traps.
The key to success with big data projects is to build an iterative approach that encourages existing employees to participate, letting them learn and accumulate experience through a series of low-stakes failures.
Herd mentality
Big data is undeniably a transformational technological achievement. According to a Gartner survey, 64% of respondents in 2013 said they had invested or were planning to invest in big data systems, up from 58% in the 2012 survey. More and more companies are exploring their own data, trying to use the information it contains to reduce customer churn, analyze financial risks, and improve the customer experience.
Of the 64% of respondents with big data plans, 30% have already invested in big data technology, 19% plan to invest within the coming year, and another 15% plan to invest within the next two years. But fewer than 8% of Gartner's 720 respondents have actually deployed a big data solution.
This is a poor showing, and the reason behind the failures is worse: most companies simply have no idea what to do once they enter the big data field.
No wonder so many companies are now paying handsome salaries to hire data scientists, whose average income has reached $123,000 a year.
Eight causes of failure
Because many enterprises find themselves completely lost when exploring their own data, they decide, once they realize this, to seek help from professionals who promise more predictable outcomes (including the often wildly exaggerated expectation that data scientists can miraculously solve whatever problems they face). Gartner analyst Svetlana Sicular summarizes eight common causes of failure for big data projects:
• Management resistance. Although data contains a large amount of important information, a Fortune Knowledge Group survey found that 62% of business leaders still tend to trust their intuition, and more than 61% of respondents believe that executives' practical insight should take precedence over data analysis in decision-making.
• Choosing the wrong use cases. Companies tend to make one of two mistakes: either they launch a big data project too ambitious for them to handle, or they try to tackle big data problems with traditional data technology. In either case, the project is likely to run into trouble.
• Asking the wrong questions. Data science is complex: it combines domain expertise (deep knowledge of the actual business in banking, retail, or whatever the industry may be), mathematical and statistical experience, and programming skills. Many enterprises hire data scientists who understand only the mathematics and the programming, but lack the most important ingredient: knowledge of the industry. Sicular is right when she advises looking for data scientists inside the company first, because "learning Hadoop is easier than learning the business."
• Lacking the necessary skills. This is closely related to "asking the wrong questions." Many big data projects bog down and eventually fail because the team lacks the necessary skills. IT technicians are often put in charge of such projects, and they are usually unable to ask the data the right questions to guide decisions.
• Unexpected problems beyond the big data technology itself. Data analysis is only one component of a big data project; the ability to access and process the data is equally important. Frequently overlooked factors include network bandwidth limitations, staff training, and the like.
• Conflict with enterprise strategy. To make a big data project succeed, you have to stop treating it as a standalone "project" and instead regard it as the core way the business uses data. The problem is that other departments' priorities or strategic goals often outrank big data, which leaves the project hamstrung at every turn.
• Big data islands. Big data vendors love to talk about "data lakes" and "data hubs," but in reality many businesses build "data puddles" with clear boundaries between them, such as a marketing data puddle and a manufacturing data puddle. It must be emphasized that big data can deliver its real value only when the gaps between departments are minimized and the data streams are pooled together.
• Problem avoidance. Sometimes we know, or suspect, that the data will force us to take actions we would rather avoid. For example, the pharmaceutical industry has rejected sentiment analysis because it does not want to discover adverse side effects it would then be obliged to report to the FDA, with all the legal liability that entails.
Looking through this list, you may have noticed a common theme: no matter how highly we regard the data itself, human factors are always involved. Even as we strive to gain full control of the data, the big data process is ultimately managed by people, starting with many initial decisions, such as which data to collect and analyze and which questions to ask of the results.
Implementing Innovation through Iteration
Since many companies seem simply unable to get their big data projects off the ground, and most such projects end in failure, it makes sense to introduce an iterative approach. This does not require paying large sums to consulting firms or vendors; it is better to build a low-cost data experimentation program that involves in-house employees.
Given that almost all of the major big data technologies are open source, it is entirely feasible to start small and identify problems quickly. Better still, many platforms are available as cloud services, which makes experimentation fast and cheap and further reduces the money sunk into trial and error.
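The "start small" idea can be illustrated with a minimal sketch: before committing to a full Hadoop or cloud deployment, an in-house analyst can prototype a single business question, say "which customer segment churns most?", using nothing but a small data sample and the standard library. The segment names and records below are hypothetical placeholders, not real data.

```python
# A tiny "start small" data experiment: answer one concrete question on a
# sample before investing in big data infrastructure. All data here is
# hypothetical, for illustration only.
from collections import defaultdict

def churn_rate_by_segment(records):
    """Return {segment: churn_rate} from (segment, churned) pairs."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for segment, did_churn in records:
        totals[segment] += 1
        if did_churn:
            churned[segment] += 1
    return {seg: churned[seg] / totals[seg] for seg in totals}

# Hypothetical sample of (segment, churned?) observations.
sample = [
    ("enterprise", False), ("enterprise", False), ("enterprise", True),
    ("consumer", True), ("consumer", True), ("consumer", False),
    ("consumer", False),
]

rates = churn_rate_by_segment(sample)
print(rates)
```

If an experiment like this surfaces a useful signal, it justifies scaling the same question up to the full dataset; if it fails, the failure was cheap.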
The crux of big data is asking the right questions, which is why involving your own employees in the project matters so much. But even with superior industry knowledge, the process cannot begin at all if the enterprise is unable to collect the right data in the first place; such issues should also be anticipated and prepared for.
The key is a flexible, open data infrastructure that lets employees keep adjusting their experiments until their efforts yield useful feedback. In this way, an enterprise can shed its fear of failure and ultimately iterate its way to success with big data.