Basic concepts and architecture of Sahara

Last Update:2014-12-10 Source: Internet

Author: User

Statement:

Reprints of this blog are welcome, but please keep the original author information and cite the source.

Guo Deqing

Team: Huawei Hangzhou OpenStack Team

Sahara is designed to let users deploy Hadoop clusters easily by supplying a few simple parameters, such as the Hadoop version, the cluster topology, and node hardware information. Once the user has provided these parameters, Sahara quickly deploys the Hadoop cluster. It also supports scaling an existing cluster by adding or removing nodes.

Its use cases include:

1) Quickly configuring and deploying Hadoop clusters on OpenStack.

2) Leveraging the computing power of the OpenStack IaaS layer.

3) Providing analytics as a service, similar to Amazon EMR.

The main features of Sahara include:

1) It is a component of OpenStack.

2) It is managed through a REST API, with a user interface available in the OpenStack dashboard.

3) It supports different Hadoop versions.

4) It provides configurable Hadoop configuration templates.

Sahara interacts with the following OpenStack components: Horizon (GUI), Keystone (authentication), Nova (creates the virtual machines for the Hadoop cluster), Heat (Sahara can be configured to use Heat to orchestrate the services required for a Hadoop cluster), Glance (stores the Hadoop virtual machine images), Swift (can store the data processed by Hadoop jobs), Cinder (block storage), Neutron (network services), and Ceilometer (collects information from the cluster for metering and monitoring).

The main workflows are as follows.

The usual steps for quickly provisioning a cluster are:

1) Select the Hadoop version.

2) Select an image, with or without Hadoop pre-installed (for images without Hadoop pre-installed, Sahara supports deployment through a pluggable deployment engine).

3) Set the cluster parameters: size, topology, and so on.

4) Create the cluster: Sahara provisions the virtual machines and configures Hadoop.

5) Manage the cluster, including adding or removing nodes.

6) Delete the cluster.
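
The six provisioning steps above can be pictured as the life cycle of a single cluster object. The following is only a minimal sketch in plain Python: the class names, fields, and status strings are hypothetical and are not Sahara's data model or API. It does not talk to OpenStack at all; it only shows the order in which the steps apply.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class NodeGroup:
    """Hypothetical model: a set of identical VMs, e.g. one master or several workers."""
    name: str
    flavor: str           # hardware profile of each VM
    count: int            # number of VMs in this group
    processes: List[str]  # Hadoop daemons to run on each VM


@dataclass
class Cluster:
    """Hypothetical model of a cluster moving through the steps in the text."""
    name: str
    hadoop_version: str   # step 1: Hadoop version
    image: str            # step 2: VM image
    node_groups: Dict[str, NodeGroup] = field(default_factory=dict)  # step 3
    status: str = "Defined"

    def add_group(self, group: NodeGroup) -> None:
        # Step 3: describe size and topology before anything is created.
        self.node_groups[group.name] = group

    def create(self) -> None:
        # Step 4: a real service would boot VMs and configure Hadoop here.
        if not self.node_groups:
            raise ValueError("a cluster needs at least one node group")
        self.status = "Active"

    def scale(self, group_name: str, new_count: int) -> None:
        # Step 5: add or remove nodes in one group of a running cluster.
        if self.status != "Active":
            raise RuntimeError("only an active cluster can be scaled")
        if new_count < 0:
            raise ValueError("node count cannot be negative")
        self.node_groups[group_name].count = new_count

    def size(self) -> int:
        # Total number of VMs across all node groups.
        return sum(group.count for group in self.node_groups.values())

    def delete(self) -> None:
        # Step 6: release everything that belongs to the cluster.
        self.node_groups.clear()
        self.status = "Deleted"
```

A caller would build a `Cluster`, add a master group and a worker group, call `create()`, later call `scale("worker", 5)` to grow the worker group, and finally call `delete()`.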

Common analytics-as-a-service workflow:

1) Select one of the predefined Hadoop versions.

2) Configure the job:

a) Select the job type: Pig, Hive, jar-file, etc.

b) Provide the location of the job script or the jar file.

c) Select the location of the input and output data.

d) Select the location for the logs.

3) Set the size of the cluster.

4) Run the job.

5) Get the job results.
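
The information a user supplies in this workflow can be collected into one request and checked before anything is run. The sketch below is an illustration only: `JobRequest`, its field names, and the `SUPPORTED_JOB_TYPES` set are hypothetical and do not reflect Sahara's actual EDP objects. The job types are the ones named in the text, with "Jar" standing in for jar-file.

```python
from dataclasses import dataclass
from typing import List

# Job types named in the text; "Jar" stands for a jar-file job (assumption).
SUPPORTED_JOB_TYPES = {"Pig", "Hive", "Jar"}


@dataclass
class JobRequest:
    """Hypothetical container for what the user provides in steps 2 and 3."""
    job_type: str         # step 2a
    main_location: str    # step 2b: script or jar location
    input_location: str   # step 2c
    output_location: str  # step 2c
    log_location: str     # step 2d
    cluster_size: int     # step 3

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the job can be submitted."""
        errors = []
        if self.job_type not in SUPPORTED_JOB_TYPES:
            errors.append("unsupported job type: " + self.job_type)
        locations = (
            ("main", self.main_location),
            ("input", self.input_location),
            ("output", self.output_location),
            ("log", self.log_location),
        )
        for label, value in locations:
            if not value:
                errors.append(label + " location is missing")
        if self.cluster_size < 1:
            errors.append("cluster size must be at least 1")
        return errors
```

Returning a list of problems rather than raising on the first one lets a user interface report every missing field at once.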

Sahara System Architecture Diagram:

The Sahara architecture contains several modules:

Authentication module: Responsible for authentication and authorization; communicates with Keystone.
DAL (Data Access Layer): Handles database access.
Provisioning engine: Communicates with Nova, Heat, Cinder, and Glance.
Vendor plug-ins: Pluggable mechanisms for configuring and starting Hadoop services on the virtual machines. Existing management solutions such as Apache Ambari and the Cloudera Management Console can be used.
EDP (Elastic Data Processing): Responsible for scheduling and managing compute jobs on Hadoop clusters provisioned by Sahara.
REST API: Exposes Sahara functionality through REST.
Sahara Python client: Like other OpenStack components, Sahara has its own Python client.
GUI for Sahara: The Sahara GUI is available in Horizon.
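
The vendor plug-in idea in the module list can be illustrated with a small registry that dispatches to interchangeable plug-ins. This is a rough sketch under stated assumptions, not Sahara's plug-in interface: `ProvisioningPlugin`, `DemoPlugin`, `PluginRegistry`, their method names, and the version strings are all invented for illustration.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class ProvisioningPlugin(ABC):
    """Hypothetical interface that each vendor plug-in would implement."""
    name: str

    @abstractmethod
    def supported_versions(self) -> List[str]:
        """Hadoop versions this plug-in can deploy."""

    @abstractmethod
    def configure(self, hosts: List[str], version: str) -> List[str]:
        """Return the actions taken to configure Hadoop on the given hosts."""


class DemoPlugin(ProvisioningPlugin):
    """Stand-in plug-in; a real one would call a vendor management tool."""
    name = "demo"

    def supported_versions(self) -> List[str]:
        return ["1.2.1", "2.4.1"]  # example values only

    def configure(self, hosts: List[str], version: str) -> List[str]:
        return ["configure Hadoop " + version + " on " + host for host in hosts]


class PluginRegistry:
    """Looks up a plug-in by name so the caller never depends on a vendor."""

    def __init__(self) -> None:
        self._plugins: Dict[str, ProvisioningPlugin] = {}

    def register(self, plugin: ProvisioningPlugin) -> None:
        self._plugins[plugin.name] = plugin

    def provision(self, plugin_name: str, version: str, hosts: List[str]) -> List[str]:
        plugin = self._plugins[plugin_name]
        if version not in plugin.supported_versions():
            raise ValueError(plugin_name + " does not support Hadoop " + version)
        return plugin.configure(hosts, version)
```

Because callers only use the registry, supporting another Hadoop distribution in this sketch means registering one more plug-in class, with no change to the calling code.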

Resources

http://docs.openstack.org/developer/sahara/overview.html

http://docs.openstack.org/developer/sahara/architecture.html
