Viewpoint: Hadoop is not the whole of big data processing

Source: Internet
Author: User

The great thing about cloud computing is that when you process big data, you no longer have to buy large server clusters as in the past; renting servers to handle the data makes costs easier to control. As a heavyweight open source framework for distributed processing, Hadoop has made its mark in big data processing, and companies want to use it to plan their future data processing blueprints. From EMC and Oracle to Microsoft, almost all high-tech vendors have announced Hadoop-based big data strategies over the past few months. Today Hadoop has become a buzzword that IT vendors use to attract customers.

The growth of Hadoop has been supported by individual developers, start-ups and large enterprises. This gives users reason to expect that Hadoop will remain usable for a long time. However, because different vendors keep modifying the code, their products cannot interoperate. In this respect the current state of Hadoop is very similar to that of Android.

Most companies don't really understand big data

The advantage of "big data" is not just scale but performance, regardless of how many dimensions the data set has. This matters for immediate analysis, such as evaluating a customer's behavior on a website to better understand what support they need or what products they are looking for, or working out how current weather and other conditions affect delivery routes and schedules. This is where server clusters, high-performance file systems, and parallel processing come into play. In the past, these technologies were so expensive that only large enterprises could use them. Today, virtualization and commodity hardware significantly reduce their cost, making "big data" available to small and medium-sized enterprises.
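As a rough illustration of the parallel processing idea, the sketch below counts page views with a map step that runs on separate partitions and a reduce step that merges the partial results. It is only a single-machine analogy for what a cluster framework does across many servers: the event data, the function names, and the use of a thread pool in place of cluster nodes are all assumptions made for this example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical page-view events: (customer_id, page). Illustrative data only.
EVENTS = [
    ("alice", "/support"),
    ("bob", "/products/router"),
    ("alice", "/support"),
    ("carol", "/products/router"),
    ("bob", "/support"),
    ("alice", "/products/switch"),
]


def split_into_partitions(events, n):
    """Deal events round-robin into n partitions, much as a cluster shards its input."""
    return [events[i::n] for i in range(n)]


def map_partition(partition):
    """Map step: count page views within one partition only."""
    return Counter(page for _customer, page in partition)


def reduce_counts(partials):
    """Reduce step: merge the per-partition counts into one total."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def count_page_views(events, workers=3):
    """Run the map step on each partition in a worker thread, then reduce."""
    partitions = split_into_partitions(events, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_partition, partitions))
    return reduce_counts(partials)


page_views = count_page_views(EVENTS)
print(page_views.most_common())
```

Because each partition is counted independently, adding workers (or, in a real cluster, adding machines) does not change the result, only how the work is divided.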

Smaller companies also have another way to use big data analysis: the cloud. Big data cloud services are emerging that provide platforms and tools for running analyses quickly and efficiently.

Capgemini CTO Joe Coyle says big data will be a future trend, but many companies don't understand what it means. The concepts customers ask about most are cloud computing and big data.

Now that Hadoop is hot, dissenting voices are emerging in the industry. Some vendors argue that companies are overhyping Hadoop. Building and maintaining a Hadoop cluster is complex and requires specialized expertise, and hiring people with that expertise is expensive. Larry Feinsmith, a managing director at JPMorgan Chase, has said that the company is not only willing to hire qualified professionals but also offers them generous compensation, 10% above the industry norm.

Not all industries should deploy Hadoop

In manufacturing, the business itself and product lifecycle management typically generate large amounts of relational and non-relational data for ERP and inventory systems. Companies want a complete big data collection and analysis solution, but not every business needs to switch to Hadoop immediately.

GE Intelligent Platforms has built testing software to collect data from complex manufacturing processes. The move has also accelerated development of its own Proficy Historian 4.5 software. GE claims that the approach Proficy Historian provides is more reliable than using Hadoop. Brian Courtney of GE's enterprise data management department says the company's out-of-the-box solutions offer an environment comparable to Hadoop, with the advantage that they are cheaper and easier to manage than Hadoop.

GE has a large amount of historical data, most of which comes from the production and testing stages. Proficy Historian is used to process relational and non-relational data generated by product manufacturing and testing, such as waveforms, and can be used to anticipate possible problems.

For example, when a turbine engine is started, Proficy Historian can detect and display the corresponding electrical signature. What happens when an anomaly occurs during a normal start-up and load test? Has a similar situation occurred before? When a similar system failure is found, you can also see how long it took to resolve, which helps the manufacturer decide which faults to fix first. Proficy Historian can also compare against historical data to see whether similar problems have occurred in the past, and generate reports ahead of other possible anomalies, Brian Courtney said.
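The kind of lookup Courtney describes can be sketched in a few lines. The example below is not Proficy Historian's API or algorithm; it is a minimal illustration that assumes each past incident is stored with a sampled start-up waveform and a resolution time, and that similarity is measured by plain Euclidean distance. All names, waveform values, and the threshold are hypothetical.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class PastIncident:
    label: str
    signature: list          # sampled start-up waveform (hypothetical values)
    hours_to_resolve: float


def distance(a, b):
    """Euclidean distance between two equal-length waveforms."""
    if len(a) != len(b):
        raise ValueError("waveforms must have the same number of samples")
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similar_incidents(current, history, threshold):
    """Return past incidents whose signature lies within `threshold`, closest first."""
    scored = [(distance(current, inc.signature), inc) for inc in history]
    scored.sort(key=lambda pair: pair[0])
    return [inc for d, inc in scored if d <= threshold]


# Illustrative history; a real historian would hold far more samples per run.
HISTORY = [
    PastIncident("normal start-up", [0.0, 0.5, 1.0, 1.0], 0.0),
    PastIncident("ignition lag", [0.0, 0.1, 0.4, 1.0], 6.5),
    PastIncident("overspeed trip", [0.0, 0.9, 1.6, 1.2], 30.0),
]

current_run = [0.0, 0.2, 0.5, 1.0]
matches = similar_incidents(current_run, HISTORY, threshold=0.3)
for inc in matches:
    print(inc.label, inc.hours_to_resolve)
```

Here the current run matches only the "ignition lag" incident, and its recorded resolution time gives a rough basis for prioritizing the fix. A production system would likely use a more robust similarity measure than raw Euclidean distance.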

The new version of the Proficy software is designed to handle larger volumes of data. Earlier versions of Proficy supported 2 million tags; today it supports up to 15 million.

Amazon deploys HPCC on its cloud computing platform

Amazon has adapted its cloud computing platform to run HPCC, an open source data processing platform launched by LexisNexis. The move also gives further weight to the idea that HPCC Systems could replace today's popular Hadoop.

Armando Escalante, CTO of HPCC Systems, said in September that although HPCC cannot yet attract large companies and governments the way Hadoop does, this has spurred HPCC's developers to build an ecosystem like Hadoop's.

Some analysts are also bullish on HPCC Systems, but the HPCC community has a long way to go before it is as vibrant as the Hadoop community. Amazon has now provided a good example of HPCC running on AWS, in the cloud, and HPCC supports AWS's Elastic MapReduce. Amazon says there will be more surprises to come.

From a technical point of view, Amazon Web Services currently runs only one part of HPCC's big data processing: the Thor Data Refinery Cluster. The platform also includes another data processing component, the Roxie Rapid Data Delivery Cluster. Roxie, which serves as the data warehouse and data query layer, plays a role similar to Apache Hive and HBase.

Both HBase and Hive in the Hadoop project have their own language. The HPCC Systems platform uses a language known as ECL (Enterprise Control Language).

(Responsible editor: The good of the Legacy)
