Recently we needed to configure a cluster for some experiments. Ten nodes is not many, but a batch management tool still reduces the workload. While using it we found that, although there is material online, most of it did not solve the problems we actually ran into, so we are recording them here for later reference. Corrections from experts are welcome.
First, here is the IP range of the nodes:
    172.31.42.68~172.31.42.77
Step one: Preparation
To make the later steps smoother, first set up passwordless SSH between the nodes. We chose the .68 machine as the master and the other nodes as slaves. Unless stated otherwise, all subsequent operations are run on the master.
1. Generate public and private keys
    ssh-keygen -t rsa
Press ENTER at each prompt to accept the defaults. This generates the machine's key pair, saved under ~/.ssh.
2. Distribute Master's public key to each slave node
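    # common prefix of the node addresses; slaves are .69 to .77
    ip=172.31.42.
    for i in $(seq 69 77)
    do
        # create the .ssh directory on the slave, then copy the master's public key
        ssh $ip$i "mkdir -p /home/edmonds/.ssh"
        scp ~/.ssh/id_rsa.pub $ip$i:/home/edmonds/.ssh/authorized_keys
    done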
After these two steps, you should be able to log in to each slave node from the master without a password. Passwordless access from the slave nodes back to the master can be configured later, once the parallel SSH tools are installed.
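Note that copying id_rsa.pub directly onto authorized_keys replaces any keys that were already on the slave. A minimal sketch of an append-only alternative follows. The helper name install_pubkey is hypothetical, and the 700/600 permissions are an assumption about what sshd expects in its default strict mode. Where it is available, ssh-copy-id does roughly the same job in one step.

```shell
#!/bin/sh
# install_pubkey PUBKEY_FILE SSH_DIR
# Hypothetical helper: append a public key to SSH_DIR/authorized_keys
# unless an identical line is already present, and tighten permissions
# (700 on the directory, 600 on the file; assumed sshd defaults).
install_pubkey() {
    pubkey_file=$1
    ssh_dir=$2
    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    touch "$ssh_dir/authorized_keys"
    chmod 600 "$ssh_dir/authorized_keys"
    # -q quiet, -x whole-line match, -F fixed strings, -f patterns from file
    if ! grep -qxFf "$pubkey_file" "$ssh_dir/authorized_keys"; then
        cat "$pubkey_file" >> "$ssh_dir/authorized_keys"
    fi
}

# Example, run on a slave after the key file has been copied there:
# install_pubkey /tmp/id_rsa.pub /home/edmonds/.ssh
```

Because the helper checks for an existing identical line first, running it twice leaves a single entry.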
Step two: Install parallel ssh and do a simple configuration
1. Download from reference 2 and install:
    git clone http://code.google.com/p/parallel-ssh/
    cd parallel-ssh
    python setup.py build
    sudo python setup.py install
2. Simple configuration
    touch ~/slaves_list.txt
    vim ~/slaves_list.txt
    # insert the following content
    172.31.47.69
    172.31.47.70
    172.31.47.71
    172.31.47.72
    172.31.47.73
    172.31.47.74
    172.31.47.75
    172.31.47.76
    172.31.47.77
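For a consecutive address range, the host list can also be generated instead of typed by hand. The sketch below assumes the slaves share one address prefix and differ only in the last octet; make_host_list is a hypothetical helper name.

```shell
#!/bin/sh
# make_host_list PREFIX FIRST LAST
# Hypothetical helper: print one address per line for the hosts
# PREFIX+FIRST .. PREFIX+LAST.
make_host_list() {
    prefix=$1   # common address prefix, including the trailing dot
    first=$2    # last octet of the first slave
    last=$3     # last octet of the last slave
    for i in $(seq "$first" "$last"); do
        echo "${prefix}${i}"
    done
}

# Example (redirect to wherever the host list should live):
# make_host_list 172.31.47. 69 77 > ~/slaves_list.txt
```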
Step three: Usage
1. Remote package installation
Cluster management often involves installing packages. Taking Ubuntu as an example, sudo apt-get install normally asks you to type Y to confirm the installation. Under pssh there is no chance to answer the prompt, so pass -y to apt-get:
    pssh -h slaves_list.txt -P "sudo apt-get install -y g++"
If a slave has a slow network connection, or is slow for some other reason, the command may run past the timeout, and the pssh client will then end the process. If you cannot be sure the command will finish quickly, specify a timeout explicitly:
    pssh -h slaves_list.txt -t 1200 -P "sudo apt-get install -y libboost-dev"
Here 1200 is the timeout in seconds. For the other options, see pssh --help.
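When a long-running command is sent to many hosts, it can help to keep each host's output on disk and to check whether pssh reported a failure. The wrapper below is only a sketch: run_on_slaves and the directory names pssh_out and pssh_err are hypothetical, and it assumes that pssh exits with a non-zero status when any host fails or times out. The -o and -e options name the directories for per-host standard output and standard error.

```shell
#!/bin/sh
# run_on_slaves TIMEOUT COMMAND
# Hypothetical wrapper: run COMMAND on every host in slaves_list.txt with
# an explicit timeout, keeping per-host stdout/stderr in local directories.
run_on_slaves() {
    timeout_s=$1
    remote_cmd=$2
    # -h hosts file, -t timeout in seconds,
    # -o / -e directories for per-host stdout / stderr
    if pssh -h slaves_list.txt -t "$timeout_s" \
            -o pssh_out -e pssh_err "$remote_cmd"; then
        echo "all hosts finished"
    else
        echo "some hosts failed or timed out; see pssh_err/" >&2
        return 1
    fi
}

# Example:
# run_on_slaves 1200 "sudo apt-get install -y libboost-dev"
```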
2. Remote multi-command execution
Sometimes you need to run several commands in one go. Separate them with semicolons inside the quotes:
    pssh -h slaves_list.txt -t 12000 -P "cd ~/soft/tbb43_20141204oss/build; chmod +x *.sh; sh generate_tbbvars.sh; sh tbbvars.sh"
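One thing to watch with semicolons: each command runs whether or not the previous one succeeded, so if the cd fails, the remaining commands run in the wrong directory. Chaining with && stops at the first failure instead. The following is a small local demonstration that does not involve pssh; the path /no/such/dir is hypothetical and assumed not to exist.

```shell
#!/bin/sh
# Local demonstration, no remote hosts involved.
# /no/such/dir is a hypothetical path that is assumed not to exist.

# With ';' the second command still runs after the failed cd.
semicolon_result=$(sh -c 'cd /no/such/dir 2>/dev/null; echo ran')

# With '&&' the chain stops as soon as cd fails, so nothing is printed.
# '|| true' keeps this script going even though the inner shell fails.
and_result=$(sh -c 'cd /no/such/dir 2>/dev/null && echo ran' || true)

echo "with ';' : '${semicolon_result}'"
echo "with '&&': '${and_result}'"
```

Applied to a pssh command string, this would mean writing "cd ~/soft/tbb43_20141204oss/build && chmod +x *.sh && sh generate_tbbvars.sh && sh tbbvars.sh" inside the quotes.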
These were the two main problems we ran into: how to handle commands that expect interactive input on the node (pass an option such as -y so that the command does not prompt), and how to run several commands in a single invocation.
The other common tools, such as pscp, prsync, pslurp and pnuke, work in much the same way, so no examples are given here; check each tool's help when needed.
Reference:
http://www.forzw.com/archives/671
https://code.google.com/p/parallel-ssh/
Cluster batch management tool parallel SSH installation and use