Deploying Hadoop and other applications that consume different types of resources together in a shared data center can improve overall resource utilization;
Flexible virtual machine operations let users dynamically create and expand their own Hadoop clusters from data center resources, or shrink existing clusters and release resources to other applications when needed;
Integrating the HA and FT features of the virtualization platform avoids the single points of failure found in traditional Hadoop clusters and, combined with Hadoop's own data reliability, gives enterprise big data applications a dependable foundation.
For these reasons, vSphere Big Data Extensions (BDE) helps users flexibly deploy and manage Hadoop clusters in virtualized environments. These advantages aside, does virtualization hurt Hadoop's runtime performance? To answer this, we compared and tuned the performance of Hadoop clusters of the same scale in virtualized and physical deployments. The experiments show that a virtualized Hadoop cluster can support production workloads well.
Performance comparisons between virtualized and physical environments
Figure 1 shows the deployment used for the performance tuning tests: only one virtual machine is deployed on each physical server, and the TaskTracker and DataNode run together on the same node. Because each virtual node can use all of the server's resources, this layout makes it easy to compare and analyze virtualized Hadoop against Hadoop deployed in a traditional physical environment. As Figure 2 shows, virtualized Hadoop performs almost on par with the physical environment.
Figure 3 shows the deployment topology we recommend for production environments, with multiple virtual nodes deployed on a single physical server. As shown in Figure 2, this deployment increases resource utilization and achieves higher performance.
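To make the difference between the two topologies concrete, here is a minimal sketch of how one host's resources divide when it runs one virtual node versus several. It is not BDE's sizing logic: the `Host` and `VmSize` types, the `size_vms` helper, the even split, the hypervisor memory reservation, and all of the numbers are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """Hypothetical physical server capacity (illustrative numbers only)."""
    cores: int
    memory_gb: int
    disks: int


@dataclass(frozen=True)
class VmSize:
    """Resources assigned to each virtual Hadoop node."""
    vcpus: int
    memory_gb: int
    disks: int


def size_vms(host: Host, vms_per_host: int, reserved_memory_gb: int = 0) -> VmSize:
    """Split a host evenly across `vms_per_host` virtual nodes.

    `reserved_memory_gb` is memory held back for the hypervisor; the value
    is an assumption, not a figure from the article.
    """
    if vms_per_host < 1:
        raise ValueError("vms_per_host must be at least 1")
    usable_memory_gb = host.memory_gb - reserved_memory_gb
    if usable_memory_gb < vms_per_host:
        raise ValueError("not enough memory left for the requested VM count")
    # Integer division: any remainder stays unassigned on the host.
    return VmSize(
        vcpus=host.cores // vms_per_host,
        memory_gb=usable_memory_gb // vms_per_host,
        disks=host.disks // vms_per_host,
    )


host = Host(cores=16, memory_gb=128, disks=8)

# One VM per host: the virtual node gets essentially the whole server.
print(size_vms(host, vms_per_host=1, reserved_memory_gb=8))

# Four VMs per host: each virtual node gets a quarter of the server.
print(size_vms(host, vms_per_host=4, reserved_memory_gb=8))
```

In the single-VM layout the virtual node is sized like the physical server, which is what makes a direct comparison with a physical deployment meaningful. In the multi-VM layout the same hardware is carved into smaller nodes.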
We have also built the lessons from these experiments into the system configuration of Hadoop clusters deployed by vSphere BDE, shielding users from the complexity of performance tuning. Although different data center settings and cluster configurations can lead to different performance, here are some general guidelines for creating, configuring, and expanding a Hadoop cluster: