This series follows the "Deploying Big Data Virtualization from Scratch" tutorials. In the spirit of knowing not only how but also why, it looks inside big data virtualization in two posts that explain the deployment architecture and the system architecture of vSphere Big Data Extensions (hereinafter BDE): how deployment works, what the internal components are, and what role each one plays. We hope you find it helpful, and your comments and feedback are welcome.
Part 1: The Serengeti virtual appliance (this article)
Part 2: The system architecture of the Serengeti Management Server
The Serengeti Virtual Appliance
vSphere Big Data Extensions (BDE) is VMware's commercial release built on the open source Serengeti technology. It focuses on strengthening vSphere infrastructure support for Serengeti so that big data workloads can be deployed, run, and managed more effectively.
From a deployment perspective, BDE packages the Serengeti virtual appliance together with a plug-in for the vCenter Web Client.
The Serengeti virtual appliance contains the Serengeti Management Server and a virtual machine template. The appliance can be deployed easily on VMware vCenter.
Figure: Deployment structure of big data virtualization (BDE/Serengeti)
The Serengeti Management Server is the core component of Serengeti. It deploys and manages Hadoop clusters in a virtualized environment and provides different resource usage policies for different users.
Customers who need high resource utilization can use Serengeti to share resources effectively between Hadoop and other applications. Customers who need higher Hadoop performance can use Serengeti to isolate resources between applications, so that Hadoop gets the best results from resources reserved exclusively for it.
The Serengeti Management Server exposes a REST API through which remote clients access and control Hadoop clusters. Both the Serengeti CLI and the BDE UI plug-in reach the Serengeti Management Server through this REST API.
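The single entry point described here can be sketched as follows. This is a minimal illustration, not Serengeti's actual client code: the class names, port 8443, the base path `/serengeti/api`, and the `clusters` resource are assumptions made for the example and should be checked against the BDE documentation. The sketch only builds request descriptions and never opens a network connection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RestRequest:
    """Describes one HTTP call to the management server (nothing is sent)."""
    method: str
    url: str


class ManagementServerApi:
    """Hypothetical thin wrapper around the management server's REST API."""

    def __init__(self, host: str, port: int = 8443,
                 base_path: str = "/serengeti/api"):
        # Port and base path are assumptions for illustration only.
        self.base_url = f"https://{host}:{port}{base_path}"

    def list_clusters(self) -> RestRequest:
        return RestRequest("GET", f"{self.base_url}/clusters")

    def get_cluster(self, name: str) -> RestRequest:
        return RestRequest("GET", f"{self.base_url}/clusters/{name}")


class CliClient:
    """Stands in for the Serengeti CLI: it only talks to the REST layer."""

    def __init__(self, api: ManagementServerApi):
        self.api = api

    def cluster_list(self) -> RestRequest:
        return self.api.list_clusters()


class UiPlugin:
    """Stands in for the BDE UI plug-in: same REST layer, different front end."""

    def __init__(self, api: ManagementServerApi):
        self.api = api

    def show_clusters(self) -> RestRequest:
        return self.api.list_clusters()
```

Because both front ends delegate to the same wrapper, they produce identical requests. That is the point of routing every client through one REST API: the management server has a single place to enforce its behavior.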
All virtual machines in a Hadoop cluster are cloned, directly or indirectly, from the Serengeti virtual machine template, which contains a basic CentOS operating system and the small set of software that cluster installation requires. The template does not include a Hadoop installation package, because Serengeti supports multiple Hadoop distributions and versions; the selected distribution is installed during cluster creation.
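The relationship between the template and the cluster nodes can be modeled in a few lines. This is a conceptual sketch only: the types `VmTemplate` and `NodeVm`, the function names, the package name, and the distribution labels used in the usage note are all hypothetical and do not come from Serengeti.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class VmTemplate:
    """The template: a base OS plus install prerequisites, but no Hadoop."""
    os: str = "CentOS"
    packages: Tuple[str, ...] = ("cluster-install-prereqs",)  # hypothetical name
    hadoop_distro: Optional[str] = None


@dataclass(frozen=True)
class NodeVm:
    """One cluster node cloned from the template."""
    name: str
    os: str
    packages: Tuple[str, ...]
    hadoop_distro: Optional[str]


def clone(template: VmTemplate, name: str) -> NodeVm:
    # A clone starts out identical to the template, so it has no Hadoop yet.
    return NodeVm(name, template.os, template.packages, template.hadoop_distro)


def create_cluster(template: VmTemplate, cluster_name: str,
                   distro: str, node_count: int) -> List[NodeVm]:
    # Step 1: clone every node from the same template.
    nodes = [clone(template, f"{cluster_name}-node{i}") for i in range(node_count)]
    # Step 2: install the chosen distribution on each clone at creation time.
    return [replace(node, hadoop_distro=distro) for node in nodes]
```

Calling `create_cluster(VmTemplate(), "a", "distro-x", 2)` and `create_cluster(VmTemplate(), "b", "distro-y", 1)` yields two clusters with different distributions from one unchanged template. Keeping Hadoop out of the template is what makes this possible.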
After deployment, the Serengeti Management Server runs as a virtual appliance on a single virtual machine and registers itself as an extension server of vCenter. From then on, Serengeti communicates with vCenter over SSL connections to keep the data exchange secure and reliable.
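For readers unfamiliar with what a certificate-verified SSL connection involves on the client side, here is a generic sketch using Python's standard `ssl` module. It does not show how Serengeti itself connects to vCenter; the function name and the optional `ca_file` parameter are assumptions for illustration.

```python
import ssl
from typing import Optional


def make_verified_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a TLS client context that verifies the server certificate."""
    # With ca_file=None the system's default trusted CA certificates are used.
    ctx = ssl.create_default_context(cafile=ca_file)
    # create_default_context already enables both settings; they are
    # restated here so the security requirements are explicit.
    ctx.check_hostname = True
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx
```

A context like this rejects servers whose certificate is untrusted or does not match the host name, which is the property that makes the channel trustworthy and not merely encrypted.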
To be continued... The next article covers the system architecture of vSphere Big Data Extensions/Serengeti. Stay tuned!
Author Introduction
Shu Yonghua (Emma Lin)
Senior Development Engineer (Staff Engineer), VMware
Technical lead for VMware's big data product vSphere BDE and the Serengeti open source project. Has led and taken part in the design and development of Serengeti's core architecture and features, and has been through the development and release of six versions since Serengeti first appeared. Has long been committed to making VMware's virtualization infrastructure the best choice for big data applications through vSphere BDE/Serengeti. Before joining VMware, worked at the BEA/Oracle software development center on the design and development of distributed systems over many years, gaining extensive enterprise software development experience.