Build a distributed computing cluster using openmpi + NFS + NIS

1. Configure the firewall

Configure the firewall filtering rules correctly; otherwise NFS file systems cannot be mounted, NIS account authentication fails, and mpirun cannot launch remote task instances. A computing cluster generally runs on an internal LAN, so you can simply turn off the firewall on all node servers without worrying much about security.

The related commands are as follows:

service iptables stop        # or:
/etc/init.d/iptables stop    # both take effect immediately but are reverted after a reboot
# or
chkconfig iptables off       # disables iptables permanently, starting from the next reboot
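
If you would rather keep iptables running, a commonly used alternative is to trust the cluster subnet instead of stopping the firewall entirely. A minimal sketch, assuming the 192.168.0.0/24 segment used later in this article:

iptables -I INPUT -s 192.168.0.0/24 -j ACCEPT   # accept all traffic from hosts on the cluster LAN
service iptables save                           # persist the rule across reboots (CentOS/RHEL)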

2. Configure the mapping between cluster LAN IP addresses and host names

For convenience, you may want to rename the node hosts to node1, node2, node3, and so on. The command to change the host name is:

hostname node1    # change the host name to node1; the change is lost after the host restarts

To make the change permanent, modify the HOSTNAME line in the /etc/sysconfig/network file:

HOSTNAME=node1

Then modify /etc/hosts on each node and write the host name and IP address of every node in the cluster, as in the example below.
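
A minimal sketch of /etc/hosts for a four-node cluster (the addresses are illustrative and must match your own LAN):

192.168.0.30   node0
192.168.0.31   node1
192.168.0.32   node2
192.168.0.33   node3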

3. Configure the NFS Shared File System

Distributed parallel computing generally requires that the application software environment and working directories on every node server be identical. Configuring them separately on each node is troublesome, so deploying the application software and working directories in a public directory shared through NFS solves the problem: the software is installed once, and every node server can access it.

First, install the NFS packages on all nodes:

yum install nfs-utils    # on CentOS/RHEL the NFS tools are packaged as nfs-utils

Then select a node server with plenty of disk space, for example node0, and configure it as the NFS server. First edit the /etc/exports file and add a line such as:

/tmp    node*(rw,no_root_squash)    # export /tmp read-write to hosts whose name matches node* (* is a wildcard)

Then run the following commands on the NFS server node:

exportfs -ar         # run this command every time /etc/exports is modified
service nfs start    # start the NFS service

To configure the other node servers as NFS clients, run the following commands:

service nfs start                  # start the NFS service
mount -t nfs node0:/share /share   # mount the /share directory of the NFS server (node0) onto the local /share directory; create the local directory first if it does not exist
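
A quick way to confirm the mount succeeded (a sketch using standard tools):

df -h /share        # should list node0:/share as the mounted filesystem
mount | grep nfs    # lists the NFS mounts currently active on the client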

You can enable automatic mounting at startup by adding a line to the /etc/fstab file:

192.168.44.130:/share    /share    nfs    defaults    0 0

Other related commands:

showmount -e 192.168.0.30     # run on a client to check the directories exported by the NFS server
showmount -a                  # usually run on the NFS server to list the client machines that have mounted its NFS directories
chkconfig --level 35 nfs on   # start the NFS service automatically at boot

4. Configure the NIS Service

Distributed parallel computing also requires the account information on every node server to be consistent. Configuring user information separately on each node would be a large and repetitive workload. An NIS server solves this problem: all hosts can look up user information on the NIS server for account authentication. NIS (Network Information Service) is also called YP (Yellow Pages, as in a phone book).

First, install the NIS packages on all computing nodes:

yum install yp*
yum install xinetd

On all nodes, edit /etc/xinetd.d/time and set disable = no. Then run:

service xinetd restart    # start the xinetd service
nisdomainname cluster     # set the NIS domain name

Modify the /etc/sysconfig/network file on all nodes and add a line:

NISDOMAIN=cluster

Select a node server, such as node0, and configure it as the NIS server. Edit the /etc/ypserv.conf file and add three lines:

127.0.0.0/255.255.255.0   : * : none
192.168.0.0/255.255.255.0 : * : none
*                         : * : deny

Here 192.168.0.0 is the cluster's network segment; fill it in according to your actual network configuration.

Create the account database by running:

/usr/lib64/yp/ypinit -m    # when adding a user later, add it on the NIS server and run /usr/lib64/yp/ypinit -m again to update the database
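
As a side note, on typical Linux NIS installations the maps can also be regenerated with the Makefile in /var/yp instead of rerunning ypinit. A minimal sketch, with a hypothetical account name:

useradd mpiuser       # create the account on the NIS server (mpiuser is just an example name)
passwd mpiuser
cd /var/yp && make    # rebuild the NIS maps so that clients can see the new account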

After creating the database, start the ypserv and yppasswdd services:

service ypserv start
service yppasswdd start
chkconfig --level 35 ypserv on      # start the service automatically at boot
chkconfig --level 35 yppasswdd on   # start the service automatically at boot
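
To confirm that the server is answering NIS requests, you can query it through the portmapper. A quick sketch using rpcinfo (part of the standard RPC tools):

rpcinfo -u localhost ypserv    # expect a reply such as "program 100004 version 2 ready and waiting"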

The other compute node servers are configured as NIS clients. First, edit /etc/yp.conf and add a line naming the NIS domain and the NIS server (node0 here):

domain cluster server node0    # use the NIS domain "cluster" and the NIS server node0

Configure /etc/passwd and add one line:

+::::::    # note the number of colons (six)

Configure /etc/nsswitch.conf and add the following four lines:

passwd: files nis nisplus
shadow: files nis nisplus
group:  files nis nisplus
hosts:  files nis dns

Finally, run the following commands:

service ypbind restart           # start the ypbind service
chkconfig --level 35 ypbind on   # start ypbind automatically at boot
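
Once ypbind is running, two quick checks with the standard yp-tools commands tell you whether the client is bound to the server and can see its accounts:

ypwhich        # prints the NIS server the client is bound to (should be node0)
ypcat passwd   # dumps the passwd map from the server; the NIS accounts should appear here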

5. Configure ssh login without a password

If the home directory is not on the shared file system and you want host A to log in to host B without a password, configure host A as follows: cd into the .ssh directory under host A's home directory and run:

ssh-keygen -t rsa                            # keep pressing Enter; the key is saved in ~/.ssh/id_rsa by default
cp id_rsa.pub authorized_keys                # after this step you can log in to the local machine without a password
scp authorized_keys test@B:/homename/.ssh    # copy the generated authorized_keys file to host B
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
chmod 600 authorized_keys                    # run this inside host B's .ssh directory to fix the permissions of the copied file

With these steps, only A can reach B without a password. To let every node in the cluster reach every other node, you would have to repeat the procedure for each pair of nodes, which is an enormous amount of work.

If the home directory is on the shared file system, things are much simpler: run the following commands once, and every node in the cluster can access every other node without a password.

ssh-keygen -t rsa
cp id_rsa.pub authorized_keys
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys

In addition, add StrictHostKeyChecking no to the /etc/ssh/ssh_config file so that the first SSH login to a node no longer prompts whether to add the host to known_hosts.
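
A minimal sketch of the relevant lines in /etc/ssh/ssh_config (applying the option to Host * affects every outgoing connection from the node):

Host *
    StrictHostKeyChecking no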

6. Install and configure OpenMPI

First, configure OpenMPI. If you use the Intel compilers, install them first and then run:

./configure CC=icc CXX=icpc FC=ifort --prefix=/opt/openmpi/ --enable-static --enable-mpi-cxx    # note: the --prefix installation directory should be a new, dedicated directory

If you use the system's default compiler, run:

./configure --prefix=/opt/openmpi/ --enable-static --enable-mpi-cxx    # note: the --prefix installation directory should be a new, dedicated directory

Finally, compile and install OpenMPI:

make all install
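
After installation, each node needs the OpenMPI binaries and libraries on its search paths, and a trivial job can then verify the whole cluster. A sketch assuming the /opt/openmpi prefix used above and a hypothetical hostfile named hostfile:

# add to ~/.bashrc (the home directory is shared via NFS, so one edit covers all nodes)
export PATH=/opt/openmpi/bin:$PATH
export LD_LIBRARY_PATH=/opt/openmpi/lib:$LD_LIBRARY_PATH

# hostfile: one line per node, with the number of MPI slots on each, e.g.
#   node0 slots=4
#   node1 slots=4

mpirun -np 8 --hostfile hostfile hostname    # each MPI rank prints the name of the node it runs on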

7. Install and configure a job scheduling system (optional)

If you want job scheduling, you also need to install software such as LSF. Configuring such software is fairly involved and is generally unnecessary for a small cluster, so it is not covered here.
