A problem another user ran into and solved: mpich2 reports an error when running a parallel program across multiple nodes.
I ran into the following problem when using mpich2:
I wanted to run cpi.c, the C example program that computes pi in parallel, on several specific nodes, say host1, host2, and host3, so I wrote the names of these three nodes into a hostfile.
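For illustration, a hostfile for the three nodes named above could be created as follows. The one-name-per-line layout is an assumption about what mpiexec -machinefile expects here, and host1, host2, and host3 are only the example names from the text.

```shell
# Write one host name per line into a file called hostfile (assumed layout).
printf '%s\n' host1 host2 host3 > hostfile

# Show the result so it can be checked by eye.
cat hostfile
```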
The steps I ran were:
mpd &
mpicc cpi.c // an executable file named a.out is generated at this point
mpiexec -machinefile hostfile -n 3 ./a.out
The following error occurs:
mpiexec: unable to start all procs; may have invalid machine names
remaining specified hosts:
IP address (host2)
IP address (host3)
The cause is that the mpd daemons on these nodes cannot be reached, so the nodes cannot communicate; this may come from an SSH or rsh problem.
However, the problem can be worked around by running the following commands by hand (assuming the parallel program was compiled on host1, these commands are run on host1):
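Since the text only says the failure may come from SSH or rsh, one way to narrow it down is to check whether every node in the hostfile accepts a non-interactive ssh login. The sketch below assumes passwordless ssh is how the nodes are meant to be reached; check_hosts and PROBE are hypothetical names introduced only for this example.

```shell
# check_hosts is a hypothetical helper: it reads one host name per line from the
# given file and reports whether a non-interactive ssh login to that host works.
# PROBE is an illustrative override so the loop can be tried without real nodes.
check_hosts() {
    while read -r host; do
        # Skip empty lines in the hostfile.
        [ -z "$host" ] && continue
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        if ${PROBE:-ssh -o BatchMode=yes -o ConnectTimeout=5} "$host" true </dev/null >/dev/null 2>&1; then
            echo "$host: reachable"
        else
            echo "$host: NOT reachable"
        fi
    done < "$1"
}
```

Running check_hosts hostfile on host1 would then print one line per node; a node reported as NOT reachable is a candidate for the ssh problem suspected above.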
mpd &
mpdtrace -l // lists the host name and port number of the running mpd
Then log on to each of the other nodes in the hostfile (host2 and host3 here) and run the following command there:
mpd -h
Then run mpdtrace on host1; it lists the host names on which mpd is running, which shows that the nodes can now communicate normally.
Running mpiexec -machinefile hostfile -n 3 ./a.out now produces the expected result.
However, I do not think this is a final solution; it needs further investigation.
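The manual steps above amount to taking the host name and port that mpdtrace -l reports on host1 and passing them to mpd on the other nodes. The sketch below assumes that a line of mpdtrace -l output looks like "host1_34567 (192.168.0.1)" and that mpd takes the host with -h and the port with -p; join_command is a hypothetical helper, and the port and IP address are made-up values.

```shell
# join_command is a hypothetical helper: given one line of `mpdtrace -l` output
# (assumed form: "<host>_<port> (<ip>)"), print the command to run on each other node.
join_command() {
    entry=${1%% *}      # keep the text before the first space, e.g. host1_34567
    host=${entry%_*}    # text before the last underscore, e.g. host1
    port=${entry##*_}   # text after the last underscore, e.g. 34567
    echo "mpd -h $host -p $port &"
}

# Example with made-up values.
join_command "host1_34567 (192.168.0.1)"
```

The printed command is what would be typed on host2 and host3 so that their mpd joins the ring started on host1.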
From: http://blog.sina.com.cn/s/blog_4fd6fd310100aimr.html