A feature of the Big Data 2.0 era: faster data processing

Source: Internet
Author: User
Keywords: big data, data processing, faster

As the industry develops, its business opportunities are diversifying. "Big data" was one of the buzzwords of 2014, but in practice a shortage of data, limited capacity for cleaning and analyzing large data sets, bottlenecks in data visualization, and other issues have kept big data from being put into real use.

Recently, as infrastructure has matured, big data has reached a new turning point. Gagan Mehra of Software AG, a system software vendor, described his view of where big data is heading on the VentureBeat website. He believes that faster data processing, more reliable data quality, and a more segmented application market are the key features of the big data 2.0 era.

Faster data processing speed

As data grows exponentially, the need for rapid data analysis has become more urgent than ever. Almost every big data vendor wants to sell a product that is faster than its competitors'. The new Hadoop 2.0/YARN release can analyze data in near real time. Apache Spark, a next-generation big data computing framework, is reported to be 100 times faster than Hadoop.

Andreessen Horowitz, a Silicon Valley venture capital firm, has invested 14 million dollars in Databricks, a start-up whose core business is built on Apache Spark. Not long ago, Amazon also launched Kinesis, a real-time streaming data service, to help companies that lack data processing capabilities of their own.

Many analytics vendors have recognized the importance of processing speed and have built products that can process terabytes of data per second. Sensor data analysis and the rapid growth of the Internet of Things in both industrial and consumer markets are driving this change. For example, a company's sensors can generate hundreds of events per second, and processing this data in real time is very difficult. When the volume of sensor data to be processed in real time surges to 5 TB a day, speed becomes a particularly critical metric.
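To put the 5 TB-a-day figure in perspective, the short sketch below converts a daily volume into the average throughput a real-time pipeline would have to sustain. It assumes decimal units (1 TB = 10^12 bytes) and a perfectly even arrival rate, neither of which is stated in the article; the function name is purely illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def sustained_throughput_mb_per_s(tb_per_day: float) -> float:
    """Average MB/s needed to keep up with a daily volume.

    Assumes decimal units: 1 TB = 10**12 bytes, 1 MB = 10**6 bytes.
    """
    bytes_per_day = tb_per_day * 10**12
    return bytes_per_day / SECONDS_PER_DAY / 10**6


rate = sustained_throughput_mb_per_s(5)
print(f"{rate:.1f} MB/s")  # prints "57.9 MB/s"
```

This is only the average. Real sensor traffic tends to arrive in bursts, so peak rates would be higher than this number.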

At the same time, although the cost of data storage has fallen year after year, storage is still a huge expense. Some businesses therefore prefer to store data with the noise filtered out rather than the full data stream.

Intelligent cleaning of "junk data"

As data volumes that are already hard to measure keep growing exponentially, improving data quality has moved onto the agenda of many data vendors. Even if computers can process huge volumes of data efficiently, large amounts of useless "junk" data only burden the system and add to the cost of storage, hosts, and other equipment. This requires that, during processing, the data stream be "cleaned" and analyzed according to specific rules and parameters, with the system deciding automatically which data to process, so that manual intervention is no longer needed.
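The idea of cleaning a data stream against a set of rules, with no manual step, can be sketched as follows. This is a minimal illustration, not any vendor's method: the record layout, the rule names, and the numeric range are all hypothetical and would come from the application's own requirements.

```python
from typing import Callable, Dict, Iterable, Iterator

Record = Dict[str, float]
Rule = Callable[[Record], bool]

# Hypothetical rules for illustration only.
RULES: Dict[str, Rule] = {
    "has_value": lambda r: "value" in r,
    "in_range": lambda r: -50.0 <= r.get("value", float("nan")) <= 150.0,
}


def clean(stream: Iterable[Record], rules: Dict[str, Rule] = RULES) -> Iterator[Record]:
    """Yield only the records that pass every rule; the rest are dropped."""
    for record in stream:
        if all(rule(record) for rule in rules.values()):
            yield record


readings = [{"value": 21.5}, {"value": 9999.0}, {"sensor": 3.0}, {"value": -12.0}]
kept = list(clean(readings))
print(kept)  # [{'value': 21.5}, {'value': -12.0}]
```

Because `clean` is a generator, it handles records one at a time and never needs to hold the full stream in memory, which matches the goal of discarding junk data before it reaches storage.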

In such an environment, bad data acts like a virus: it can lead to a chain of wrong decisions and even cause the enterprise financial losses. One example is algorithmic stock trading, where the market moves in milliseconds and any small error can cause a huge loss.

As a result, data quality has become one of the most important parameters in service level agreements (SLAs). Vendors that cannot block low-quality data will be blacklisted by the industry and face severe financial penalties.

The business-to-business (B2B) industry is an early adopter of data quality practices, and it attaches great importance to data quality in order to keep business operations stable. Many companies even plan to deploy real-time data quality warning systems that send alerts to the staff responsible for the problem and suggest solutions.
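One simple way such a real-time warning could work is to track the share of bad records over a sliding window and raise an alert when it crosses a threshold. The sketch below assumes exactly that; the class name, window size, and threshold are hypothetical, and delivering the alert to the responsible staff is left out.

```python
from collections import deque


class QualityMonitor:
    """Track the share of bad records over the most recent `window` records."""

    def __init__(self, window: int = 100, max_bad_ratio: float = 0.05):
        self.recent = deque(maxlen=window)  # True means the record was bad
        self.max_bad_ratio = max_bad_ratio

    def observe(self, is_bad: bool) -> bool:
        """Record one result and return True if an alert should be raised."""
        self.recent.append(is_bad)
        bad_ratio = sum(self.recent) / len(self.recent)
        return bad_ratio > self.max_bad_ratio


monitor = QualityMonitor(window=10, max_bad_ratio=0.2)
results = [False] * 8 + [True] * 3  # eight good records, then three bad ones
alerts = [monitor.observe(r) for r in results]
print(alerts.index(True))  # 10: the third bad record pushes the ratio to 0.3
```

Note that while the window is still filling, a single bad record can dominate the ratio, so a production system would probably wait for a minimum number of observations before alerting.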

Machine learning is another area where data quality must be ensured. A machine learning system is deployed in a closed-loop ecosystem, and the original data quality rules are refined through pattern analysis and other data analysis techniques. High-quality data, in turn, ensures that the machine analyzes behavior patterns correctly.

More and more basic applications

The changes brought about by big data make everyone want to use it, but the technical barrier forces many people to remain spectators. Applications will help people overcome this difficulty. Over the next few years, we will see thousands of specialized applications, each addressing a vertical domain, that meet the big data challenges of every industry.

Data analysis companies that have already achieved some success include eHarmony, Roambi, and Climate Corporation. In the future, even many small businesses will be able to benefit from big data analysis without relying on specific infrastructure or hiring professional data scientists.

For example, some applications will collect related customer data from a variety of sources to better understand customer needs, so that enterprises can offer products tailored to specific target customers and earn revenue in a more targeted way. When these applications reach people's everyday dining and entertainment, health care, and other fields, life will be better.
