Sahara's successful graduation will accelerate the integration of OpenStack and Hadoop
Sergey Lukjanov, lead of the OpenStack Sahara project (formerly Savanna), officially announced yesterday that Sahara has successfully graduated from OpenStack incubation and will become one of the OpenStack core projects starting with the next OpenStack release, Juno. Sahara was launched in 2013 jointly by Hortonworks, a leading Apache Hadoop contributor; Mirantis, the largest OpenStack systems integrator; and Red Hat, a world-leading open source solutions provider and the largest contributor to the latest OpenStack release. The project is dedicated to running Apache Hadoop on OpenStack so that OpenStack users can easily provision and manage elastic Hadoop clusters, which speeds up developing and deploying Hadoop on OpenStack.
Apache Hadoop is an implementation of the MapReduce model that has been widely adopted across industries and has become the industry standard for big data processing. The Sahara project is designed to give OpenStack users a simple, fast way to deploy and manage Hadoop clusters, similar to the Amazon Elastic MapReduce (EMR) service.
Sahara's architecture builds on the following OpenStack components:
Horizon--provides the GUI for using all Sahara features.
Keystone--authenticates users and issues the security tokens used to communicate with OpenStack, so that each user is granted only their specific OpenStack permissions.
Nova--provisions the virtual machines for the Hadoop cluster.
Glance--stores the Hadoop virtual machine images, each of which contains an installed OS and Hadoop; having Hadoop preinstalled makes node provisioning easier.
Swift--can be used to store the data that Hadoop jobs process.
To build a cluster, users give Sahara information such as the Hadoop version, the cluster topology, and node hardware details. Once these parameters are provided, Sahara sets up the cluster within a few minutes and can later scale it (adding or removing worker nodes) as required.
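The parameters described above can be pictured as a small data structure. The following is a minimal sketch only, not Sahara's actual API: the class names `NodeGroup` and `ClusterSpec`, the `scale` method, the flavor names, and the Hadoop version string are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NodeGroup:
    """One group of identical nodes in the topology (hypothetical model)."""
    name: str    # role of the group, e.g. "master" or "worker"
    flavor: str  # node hardware profile; the flavor names below are assumed
    count: int   # number of nodes in this group


@dataclass
class ClusterSpec:
    """What a user supplies before a cluster is built (hypothetical model)."""
    name: str
    hadoop_version: str
    node_groups: List[NodeGroup] = field(default_factory=list)

    def total_nodes(self) -> int:
        """Return the number of nodes across all groups."""
        return sum(group.count for group in self.node_groups)

    def scale(self, group_name: str, delta: int) -> None:
        """Add (delta > 0) or remove (delta < 0) nodes in one group."""
        for group in self.node_groups:
            if group.name == group_name:
                if group.count + delta < 0:
                    raise ValueError("node count cannot go below zero")
                group.count += delta
                return
        raise KeyError(f"unknown node group: {group_name}")


# Illustrative values only; none of them come from the article.
spec = ClusterSpec(
    name="demo-cluster",
    hadoop_version="2.4.1",
    node_groups=[
        NodeGroup("master", "m1.large", 1),
        NodeGroup("worker", "m1.medium", 3),
    ],
)
spec.scale("worker", 2)  # add two worker nodes
print(spec.total_nodes())
```

Keeping the topology as a list of node groups means that scaling only changes a count on one group, which mirrors the "add or remove worker nodes" operation described in the text.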
Cloud computing provides an infrastructure platform on which big data applications can run, and running them there is one of the most widely recognized and efficient ways to process big data. Sahara effectively addresses the following usage scenarios:
1. Rapid deployment of Hadoop clusters in the OpenStack cloud environment;
2. Fuller use of underutilized computing resources in a general-purpose OpenStack IaaS cloud environment;
3. Analytics as a Service for ad hoc or bursty data analysis workloads, similar to Amazon EMR.
Integrating OpenStack and Hadoop not only maximizes server resource utilization but also greatly lowers the barrier to entry for big data processing. As a bridge between cloud computing and big data, Sahara can be expected to drive the integration of the OpenStack cloud platform with Hadoop, bring OpenStack into the big data processing market, and help turn data into commercial value faster by combining cloud computing platforms with big data processing technologies.