OpenMPI + NFS + NIS: Building a Distributed Computing Cluster
1. Configure the Firewall
Configure the firewall filtering rules correctly; otherwise the NFS file system cannot be mounted, NIS account authentication fails, and mpirun cannot launch remote task instances. Since a computing cluster is generally used on an internal LAN, you can simply disable the firewall on all node servers without worrying about security.
The related commands are as follows:
service iptables stop      # or: /etc/init.d/iptables stop
                           # both take effect immediately but are undone after a reboot
chkconfig iptables off     # takes effect permanently across reboots
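As a quick sanity check (not part of the original steps), you can confirm the firewall is stopped and will stay off:

service iptables status      # should report that the firewall is not running
chkconfig --list iptables    # every runlevel should show "off"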
2. Configure the Cluster LAN IP/Host Name Mapping
For convenience, you may want to rename the node hosts to node1, node2, node3, and so on. The command to change the host name is:
hostname node1   # changes the host name to node1, but the change is lost after a reboot
To make the change permanent, edit the HOSTNAME line in the /etc/sysconfig/network file:
HOSTNAME=node1
Then modify the /etc/hosts file on each node and record the mapping between the host name and IP address of every node in the cluster.
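For example, assuming the cluster sits on the 192.168.0.x segment used later in this article (substitute your actual addresses and node names), /etc/hosts on every node might contain:

192.168.0.30  node0
192.168.0.31  node1
192.168.0.32  node2
192.168.0.33  node3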
3. Configure the NFS Shared File System
Distributed parallel computing generally requires that the application software environment and the working directory environment be identical on every node server, and configuring them on each node individually is especially troublesome. Deploying the application software and working directories in a public directory on an NFS shared file system solves this problem effectively: everything is set up once, and all node servers can access it.
First, install the NFS suite on all nodes:
yum install nfs-utils
Then select a node server with ample disk space, such as node0, and configure it as the NFS server. First edit the /etc/exports file and add:
/share node*(rw,no_root_squash)   # export /share read-write to hosts whose names match node* (* is a wildcard)
Then run the following commands on the NFS server node:
exportfs -ar        # run this every time /etc/exports is modified
service nfs start   # start the NFS service
On the other node servers (the NFS clients), run the following commands:
service nfs start                  # start the NFS service
mount -t nfs node0:/share /share   # mount the /share directory exported by the NFS server (node0) onto the local /share directory
To mount automatically at startup, add a line to the /etc/fstab file:
node0:/share  /share  nfs  defaults  0 0
Other related commands:
showmount -e 192.168.0.30    # run on a client to list the directories exported by the NFS server
showmount -a                 # generally run on the NFS server; lists the client machines that have mounted its exports
chkconfig --level 35 nfs on  # start the NFS service automatically in runlevels 3 and 5
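As a quick check on a client (not in the original steps), you can confirm the share is actually mounted:

df -h /share        # should list node0:/share as the mounted file system
mount | grep nfs    # shows the active NFS mounts and their options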
4. Configure the NIS Service
Distributed parallel computing requires the account information on every node server to be consistent. Configuring user information on each node separately would be a large amount of repetitive work. Setting up an NIS server solves this problem: all hosts look up user information on the NIS server for account authentication. NIS (Network Information Service) is also known as YP (Yellow Pages, as in a phone book).
First, install the NIS-related packages on all computing nodes:
yum install yp*
yum install xinetd
Modify /etc/xinetd.d/time on all nodes, setting disable = no. Then run the following commands:
service xinetd restart    # restart the xinetd service
nisdomainname cluster     # set the NIS domain name
Modify the /etc/sysconfig/network file on all nodes and add a line:
NISDOMAIN=cluster
Select a node server, such as node0, to configure as the NIS server. Edit the /etc/ypserv.conf file and add three lines:
127.0.0.0/255.255.255.0   : * : * : none
192.168.0.0/255.255.255.0 : * : * : none
*                         : * : * : deny
Here 192.168.0.0 is the network segment; fill it in according to your actual network configuration.
Create the account database by running:
/usr/lib64/yp/ypinit -m   # when adding a user later, add the account on the NIS server only, then rerun /usr/lib64/yp/ypinit -m to update the database
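As an aside, the NIS server also keeps a Makefile in /var/yp for exactly this purpose, so rebuilding the maps with make is a common alternative to rerunning ypinit -m after account changes:

cd /var/yp && make    # rebuild and push the NIS maps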
After creating the database, start the ypserv and yppasswdd services:
service ypserv start
service yppasswdd start
chkconfig --level 35 ypserv on      # start ypserv automatically in runlevels 3 and 5
chkconfig --level 35 yppasswdd on   # start yppasswdd automatically in runlevels 3 and 5
On the other computing node servers (the NIS clients), first configure /etc/yp.conf by adding two lines:
domain cluster server node0   # the NIS domain name ("cluster") and the server that serves it
ypserver node0                # the NIS server; node0 here
Next append one line to /etc/passwd:
+::::::   # note the number of colons (six)
Then configure /etc/nsswitch.conf, adding the following four lines:
passwd:  files nis nisplus
shadow:  files nis nisplus
group:   files nis nisplus
hosts:   files nis dns
Finally, run the following commands:
service ypbind restart           # start the ypbind service
chkconfig --level 35 ypbind on   # start ypbind automatically in runlevels 3 and 5
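To verify that a client is bound to the NIS server and can see its accounts (a quick check, not in the original steps):

ypwhich         # should print node0, the NIS server the client is bound to
ypcat passwd    # should list the accounts served by the NIS server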
5. Configure Passwordless SSH Login
If the home directories are not on the shared file system, then for host A to log in to host B without a password, you must configure host A: cd into the .ssh directory under host A's home directory and run the following commands:
ssh-keygen -t rsa                            # press Enter at each prompt; the key is saved in ~/.ssh/id_rsa by default
cp id_rsa.pub authorized_keys                # after this step you can ssh to the local machine without a password
scp authorized_keys test@B:/homename/.ssh    # copy the generated authorized_keys file to host B
chmod 700 ~/.ssh                             # then, in the .ssh directory of host B,
chmod 600 ~/.ssh/authorized_keys             # tighten the permissions of the directory and the authorized_keys file
Following these steps, only A can access B without a password. To let every node in the cluster access every other node, you would have to repeat the procedure for each ordered pair of nodes, which is an enormous amount of work.
If the home directory is on the shared file system, things are much simpler: run the following commands once, and every node in the cluster can access every other node without a password.
cd ~/.ssh
ssh-keygen -t rsa
cp id_rsa.pub authorized_keys
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
In addition, add StrictHostKeyChecking no to the /etc/ssh/ssh_config file so that the first SSH login no longer prompts about adding the host to known_hosts.
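To verify passwordless login across the whole cluster (a sanity check using the example node names from above), each command should print a remote host name without prompting for a password:

for h in node0 node1 node2 node3; do ssh $h hostname; done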
6. Install and Configure OpenMPI
The first step is to configure OpenMPI. If you use the Intel compilers, you must install them first and then run:
./configure CC=icc CXX=icpc FC=ifort --prefix=/opt/openmpi/ --enable-static --enable-mpi-cxx
# note: you must create a new directory to serve as the installation directory
If you use the system's default compilers instead, run:
./configure --prefix=/opt/openmpi/ --enable-static --enable-mpi-cxx
# note: you must create a new directory to serve as the installation directory
Finally, compile and install OpenMPI:
make all install
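Once the build finishes, a minimal smoke test might look like the following. This is not from the original text: the paths assume the /opt/openmpi prefix used above, and the hostfile simply lists the example node names.

export PATH=/opt/openmpi/bin:$PATH                          # make mpirun and the wrapper compilers visible
export LD_LIBRARY_PATH=/opt/openmpi/lib:$LD_LIBRARY_PATH    # let programs find the OpenMPI libraries
cat > ~/hostfile <<EOF
node0
node1
node2
node3
EOF
mpirun -np 4 --hostfile ~/hostfile hostname                 # should print one host name per launched process

If each node's name comes back without a password prompt, the SSH, NFS, and OpenMPI pieces are working together.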
7. Install and Configure a Job Scheduling System (Optional)
If you want job scheduling, you also need to install software such as LSF. Configuring such software is a heavy job in itself and is generally unnecessary for a small cluster, so it is not covered in detail here.