Deploying a multi-node Hadoop cluster in Windows Azure using pre-built Linux images


This article was written by Piyush Ranjan (MSFT) of the Azure CAT team.

Infrastructure services (virtual machines and virtual networks) were recently officially released on Windows Azure, and more and more enterprise workloads are migrating to the public cloud to take advantage of its economics, scale, and speed. I recently worked on one such enterprise workload: big data in the cloud. Here I share some tips and best practices.

This project required deploying a multi-node Hadoop cluster in Windows Azure using a pre-built Linux image. I provisioned a Medium virtual machine (VM) from the CentOS 6.3 image in the Windows Azure image gallery and went on to deploy single-node core Hadoop. Everything worked fine, but when I started testing a slightly heavier workload, I found that the VM often froze or became unresponsive.

It was not hard to see that this had to do with the resources of a Medium VM; after all, it has only 2 CPU cores and 3.5 GB of memory. But I did not expect the entire VM to become unresponsive or even go offline. After discussing the issue with friends and colleagues, we determined that the VM had no swap space configured at all (swap space is the equivalent of the paging file on Windows). As a result, when memory pressure increased, the virtual memory system could not swap pages out to disk.

You can run the "free" command at the Linux shell prompt to see how the system is using memory. In particular, you can use "cat /proc/swaps" to view the status of the swap space: the configured size and the amount in use. See screenshot below.

By default, swap space is not configured on a Linux VM provisioned in Windows Azure virtual machines, so "cat /proc/swaps" returns no entries and the "free" command shows no swap activity.
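As a rough illustration of the check described above, the following sketch summarizes the contents of "/proc/swaps". The function name swap_summary and the optional file argument are hypothetical conveniences, not part of the original article; the sketch assumes the usual "/proc/swaps" layout of one header line followed by one line per swap area, with sizes reported in KiB.

```shell
#!/usr/bin/env bash
# Sketch only: summarize swap areas listed in /proc/swaps.
# swap_summary and its optional file argument are hypothetical helpers.
# Usage: swap_summary            (reads /proc/swaps)
#        swap_summary some_file  (reads a file in the same format)

swap_summary() {
    local file="${1:-/proc/swaps}"

    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 1
    fi

    # Skip the header line, then add up the Size and Used columns (KiB).
    awk 'NR > 1 { size += $3; used += $4; n++ }
         END {
             if (n == 0)
                 print "no swap configured"
             else
                 printf "%d swap area(s), %d KiB total, %d KiB used\n", n, size, used
         }' "$file"
}
```

On a VM in the state described here, with no swap area listed, the function reports that no swap is configured. Once a swap file is enabled, it reports the total and used sizes instead.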

An interesting question is why a VM provisioned from a Linux gallery image (that is, an image from the Windows Azure image gallery) does not have swap space configured automatically. Our thinking is that the user is expected to decide the size and location of the swap space and configure it afterwards. In practice, however, a user will most likely keep using a VM with no swap space configured until processes begin to crash or the VM freezes.

Once we realized that we needed to configure swap space, we set up file-based swap space on the resource disk in a few simple steps. A Medium virtual machine in Windows Azure comes with a 135 GB resource disk, mounted at "/mnt/resource". The steps for configuring file-based swap space on a VM are described below.

1. Use the "fallocate" command to allocate a swap file of an appropriate size, for example 5 GB on the resource disk. The syntax is: "fallocate -l 5g /mnt/resource/swap5g", where "swap5g" is the file name.
2. Use the "chmod" command to change the permissions of the file so that only the root user has read/write access to the swap file. The syntax is: "chmod 600 /mnt/resource/swap5g"
3. Use the "mkswap" command to set the file up as a swap area. The syntax is: "mkswap /mnt/resource/swap5g"
4. Use the "swapon" command to enable the swap file. The syntax is: "swapon /mnt/resource/swap5g"
5. Swap space is now available, and you should be able to confirm this with the "cat /proc/swaps" command.
6. Add an entry to the "fstab" file so that the swap settings are retained even if the VM is recycled in Azure. The syntax is: echo "/mnt/resource/swap5g none swap sw 0 0" >> /etc/fstab
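The steps above can be collected into a single function, sketched below. The function name setup_swap, the run helper, and the DRY_RUN switch are hypothetical additions for illustration. The sketch also assumes that the fstab entry belongs in "/etc/fstab" and that the function is run as root. With DRY_RUN=1 it only prints the commands it would run, which is a safe way to review them first.

```shell
#!/usr/bin/env bash
# Sketch only: create and enable a file-based swap area.
# setup_swap, run, and DRY_RUN are hypothetical names, not from the article.
# Usage (as root): setup_swap /mnt/resource/swap5g 5g
# Preview only:    DRY_RUN=1 setup_swap /mnt/resource/swap5g 5g

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_swap() {
    local swapfile="${1:-/mnt/resource/swap5g}"  # path used in the article
    local size="${2:-5g}"                        # size used in the article
    local fstab="${3:-/etc/fstab}"               # assumed fstab location
    local entry="$swapfile none swap sw 0 0"

    run fallocate -l "$size" "$swapfile" || return 1  # allocate the file
    run chmod 600 "$swapfile" || return 1             # root-only read/write
    run mkswap "$swapfile" || return 1                # format as swap area
    run swapon "$swapfile" || return 1                # enable it now

    # Record the swap file in fstab, skipping the append if it is already listed.
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "append to $fstab: $entry"
    elif ! grep -qF "$swapfile" "$fstab"; then
        echo "$entry" >> "$fstab"
    fi
}
```

Stopping at the first failed command avoids enabling or registering a swap file that was never fully created. The check before appending keeps repeated runs from adding duplicate fstab entries.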

The following figure shows a record of the commands from the steps above, executed on my VM.

Acknowledgements: Thanks to my colleague Amit Srivastava for helping me troubleshoot and resolve the swap issue.
