Hadoop: From the fledgling little elephant to the industry giant
Source: Internet
Author: User
With its low cost and unprecedented scalability, Hadoop has been recognized as a new generation of big data processing platform. Like SQL (Structured Query Language) 30 years ago, Hadoop is driving a new round of data revolution. Yet even though Hadoop has grown from a fledgling little elephant into an industry giant, it still needs to be perfected.
The Java-based Hadoop framework is a platform for distributed processing of big data, comprising the core software and many subprojects. Over the last decade Hadoop has become the center of the big data revolution. MapReduce, the core of Hadoop, is a programming model and execution framework for processing large and very large datasets (terabytes of data), including streaming data generated by web clicks, log files, social networks and so on. Its main ideas are borrowed from functional programming languages, and it also takes on characteristics of vector programming languages.
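To make the functional-programming lineage concrete, here is a minimal single-process sketch of the map, shuffle and reduce stages, written in plain Python. It is an illustration only and does not use Hadoop's actual Java API; the function names (map_phase, shuffle, reduce_phase, run_job) and the sample log lines are hypothetical.

```python
from collections import defaultdict
from functools import reduce


def map_phase(record):
    """Map: emit a (token, 1) pair for every whitespace-separated token in one record."""
    return [(token.lower(), 1) for token in record.split()]


def shuffle(pairs):
    """Shuffle: group emitted values by key, as a framework would between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: fold all values for one key into a single count."""
    return key, reduce(lambda a, b: a + b, values, 0)


def run_job(records):
    """Run all three stages in one process; a real cluster would spread them across nodes."""
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical click-log lines, standing in for the log data mentioned above.
    log_lines = ["GET /index.html", "GET /about.html", "POST /index.html"]
    print(run_job(log_lines))
```

Because each call to map_phase depends only on its own record, and each call to reduce_phase depends only on the values for one key, both stages can in principle be run in parallel on different machines. That independence is the property a MapReduce framework relies on to scale out.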
Internet giant Yahoo!, a pioneer in developing the Hadoop framework, turned Hadoop into a highly successful technology within six years. Compared with SQL, however, Hadoop is still imperfect in some respects, which is why so much attention today is focused on Hadoop vendors. Companies including Amazon and Cloudera have brought many innovations and provide powerful tools. Cloudera's CDH3 bundles numerous additional components that help manage and run complex tasks on Hadoop, such as Apache Mahout, Flume, Sqoop, Pig, Oozie, Hive, HBase, ZooKeeper and Whirr. Cloudera is also currently the largest provider of enterprise Hadoop technical support and training. Amazon, one of the first companies to run Hadoop in a public cloud, offers data computing services at massive scale based on MapReduce elastic computing.
But data processing is only one part of handling big data; what an organization ultimately wants is the valuable information that comes out of analysis. That makes business intelligence and data analysis vendors such as Datameer, Hadapt and Karmasphere essential.
The most obvious sign of Hadoop's value in 2011 is that five database management software vendors, EMC, IBM, Informatica, Microsoft and Oracle, have all embraced Hadoop. EMC worked with MapR, while Microsoft and Oracle partnered with Hortonworks and Cloudera respectively, and EMC and Oracle have both introduced dedicated Hadoop appliances. Now let's take a look at how Hadoop has won over the companies in the big data field.
Amazon's MapReduce-based services
Amazon launched its Elastic MapReduce service, which runs Hadoop MapReduce on EC2 (Elastic Compute Cloud), as early as 2009, so it is well placed to respond to users' applications and needs. The service has proven itself with customers ranging from small and medium-sized enterprises to very large organizations. AWS (Amazon Web Services) also includes Amazon S3 (Simple Storage Service), which delivers high scalability, reliability and availability at very low storage cost. AWS can be used to handle data-intensive tasks efficiently, such as web indexing, data mining, log file analysis, machine learning, and academic research on scientific and biological data.