Use fabric to deploy Baidu BMR spark cluster nodes

Preface

The AI competition that my friends and I attended entered the finals a while ago, and I had long wanted to combine the data preprocessing stage with the deep learning stage. However, when merging the two parts of the code we ran into some problems, so I wrote a script specifically to handle them. Now that the competition is over, I want to write this part up in case other friends run into the same problem and it can help them. If anything is wrong, please correct me. Thank you!

Background: why each node of the cluster had to be deployed during the competition

During the preliminary round, in order to quickly provide data interfaces for the deep learning models that would be built later, we kept data preprocessing as an independent step and implemented it with the simplest Python. Since our code had to be ported to the computer the judges used for verification, there was a risk that some libraries would not be installed there and the program would fail to run.

Troubles

After entering the finals, we urgently needed to combine the two parts, so we had to reinstall our libraries. If we only had the Master node, the problem would be very simple: we could package the libraries directly and write a script. But in Baidu's BMR spark cluster the Slaves cannot access the external network, so we would have to log on to the Master node, ssh to each Slave over the Master's intranet, and only then run our script to deploy the program's running environment.

Proposal

Given this situation, is there a good way to run a single script on the Master that automatically deploys the running environment of every node in the cluster? After reading a book on spark best practices, I learned about the third-party Python library fabric.

Fabric

First, a brief introduction to fabric. For details about how to use fabric, refer to the official API docs. Here I will only introduce a small part of it.

Execute local tasks

Fabric provides local("shell"), where "shell" is a Linux shell command. For example:

from fabric.api import local
local('ls /root/')  # list the files in the /root/ folder
Execute a remote task

Fabric's power lies in the fact that it can execute commands not only locally but also on a remote server, even if fabric is not installed on that server. This is implemented through ssh, so we need to define three parameters:

env.hosts = ['ipaddress1', 'ipaddress2']
env.user = 'root'
env.password = 'fuckyou.'

Then you can use run("shell") to execute the tasks we need on the remote server. For example:

from fabric.api import run, env
env.hosts = ['ipaddress1', 'ipaddress2']
env.user = 'root'
env.password = 'fuckyou.'
run('ls /root/')  # list the files in the /root/ folder
Open a folder

Sometimes we need to enter a specific folder and then execute a script or command inside it. For that we need the following two interfaces:

Local
with lcd('/root/local/'):
    local('cat local.txt')  # cat the local.txt file under /root/local/
Remote
with cd('/root/distance/'):
    run('cat distance.txt')  # cat the remote file /root/distance/distance.txt
Execute a fabric task

We can use the command line:

fab --fabfile=filename.py job_func
# filename.py is a Python file written using fabric
# job_func is the fabric task function, i.e. the main function to be executed
# you can choose both names yourself; in the introduction below I use job.py and job
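
To make this concrete, here is a minimal fabfile sketch (a hypothetical example for illustration only, not the author's actual deployment script, which comes later):

# job.py -- a minimal fabfile
from fabric.api import local

def job():
    local('echo hello from fabric')  # executed by: fab --fabfile=job.py job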
Socket

Why use socket? As I mentioned in the previous article, the Baidu BMR cluster identifies the Slaves by hostname rather than by IP address. Since fabric needs IP addresses when setting the hosts in the environment, we have to use the hostname to locate the IP address.

You may wonder: why not just hard-code the Slaves' IP addresses? Because every time Baidu BMR creates a spark cluster, the intranet IP addresses it assigns change, and the last segment of the IP keeps growing.

In summary, we still use the hostname to get the IP address.

The gethostbyname interface

We can use the gethostbyname('hostname') interface: pass in a hostname and it returns an IPv4 address.
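
For example, resolving a hostname might look like the following sketch (the hostname here is made up purely for illustration; the real Baidu BMR hostnames come from the slaves file described below):

import socket

# resolve a hypothetical slave hostname to its intranet IPv4 address
slave_ip = socket.gethostbyname('slave-node-1')
print(slave_ip)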

Use fabric to write an automatic deployment script for each node

Obtain the Slaves' hostnames

As mentioned in the previous article, the Slaves' hostnames in Baidu BMR are stored in:

'/opt/bmr/hadoop/etc/hadoop/slaves'
Convert the hostnames to IP addresses and set the fabric env parameters

import socket
from fabric.api import env, local, lcd, cd, run, put

path = '/opt/bmr/hadoop/etc/hadoop/slaves'  # the slaves file mentioned above
host_list = []
f = open(path, 'r')
slaves_name = f.read().split('\n')
for i in range(1, slaves_name.__len__() - 1):
    temp_name = slaves_name[i]
    temp_ip = socket.gethostbyname(temp_name)  # resolve the hostname to its intranet IP
    ip_port = temp_ip + ":22"                  # append the ssh port for fabric
    host_list.append(ip_port)
    del temp_name
    del temp_ip
    del ip_port
env.user = 'root'
env.password = '*gdut728'
env.hosts = host_list
Write the job to be automatically deployed

Here, what I want to automatically deploy is:
1. Download the Python third-party library jieba.
2. Decompress the downloaded jieba package locally.
3. Enter the decompressed folder and install jieba locally.
4. Transfer the downloaded package to the Slaves nodes.
5. Decompress the jieba package on the remote end.
6. On the remote end, enter the decompressed folder and install jieba.

Convert the above steps into code, that is

def job():
    local_command = "wget https://pypi.python.org/packages/71/46/c6f9179f73b818d5827202ad1c4a94e371a29473b7f043b736b4dab6b8cd/jieba-0.39.zip#md5=ca00c0c82bf5b8935e9c4dd52671a5a9"
    local(local_command)
    jieba_unzip = "unzip jieba-0.39.zip"
    jieba_path = "/root/jieba-0.39/"
    jieba_install = "python setup.py install"
    local(jieba_unzip)
    with lcd(jieba_path):
        local("ls")
        local(jieba_install)
    with lcd('/root/'):
        put("jieba-0.39.zip", '/root')
    run(jieba_unzip)
    with cd(jieba_path):
        run("ls")
        run(jieba_install)
Statement

Finally, in the shell script I mentioned in the previous article, add:

yum -y install fabric && fab --fabfile=job.py job 

Then run ./start-hadoop-spark.sh and the running environment is deployed without any further worries. Out of laziness and to save trouble, I used Python and shell to write this automatic deployment script. In the process I learned a lot and ran into plenty of trouble, so I wrote this article in the hope of easing your configuration troubles~

The result is as follows (screenshots of Master, Slaves1, and Slaves2 were shown here in the original post).
