Submitting MPI Jobs with Slurm
First, prepare an MPI program, written in Python using the mpi4py library:
helloworld.py
#!/usr/bin/env python
"""Parallel Hello World"""

from mpi4py import MPI
import sys
import time

size = MPI.COMM_WORLD.Get_size()
rank = MPI.COMM_WORLD.Get_rank()
name = MPI.Get_processor_name()

sys.stdout.write("Hello, world! I am process %d of %d on %s.\n" % (rank, size, name))

time.sleep(300)
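The time.sleep(300) at the end simply keeps the processes alive for a few minutes so that they can be inspected later (see the process trees below). Before submitting to Slurm, the program can be sanity-checked on a single node; a minimal run, assuming MPI and mpi4py are installed locally:

$ mpiexec -n 4 python /apps/mpi/helloworld.py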
Slurm Job Submission Script
helloworld.sh
#!/bin/sh
#SBATCH -o /apps/mpi/myjob.out
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
mpirun python /apps/mpi/helloworld.py
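With --nodes=2 and --ntasks-per-node=2, Slurm allocates 2 x 2 = 4 MPI ranks, which matches the "of 4" in the program output further below. A slightly extended variant of the script is sketched here for illustration only; the --job-name and -e options and the /apps/mpi/myjob.err path are assumptions, not part of the original example:

#!/bin/sh
# Job name shown by squeue (assumed addition)
#SBATCH --job-name=helloworld
# Standard output and standard error files (myjob.err is an assumed path)
#SBATCH -o /apps/mpi/myjob.out
#SBATCH -e /apps/mpi/myjob.err
# Two nodes, two MPI ranks per node = 4 ranks in total
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
# mpirun detects the Slurm allocation and launches the ranks
mpirun python /apps/mpi/helloworld.py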
Submit the MPI Job
$ sbatch helloworld.sh
View MPI Job Information
View MPI Job Status
$ squeue
JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
   40   control hellowor  jhadmin  R       3:06      2 centos6x[1-2]
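squeue can also filter and widen its output; for example, to show only this user's jobs in long format (standard squeue options, shown here just as a convenience):

$ squeue -u jhadmin -l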
View MPI Job Details
$ scontrol show jobs
JobId=40 JobName=helloworld.sh
   UserId=jhadmin GroupId=jhadmin MCS_label=N/A
   Priority=4294901724 Nice=0 Account=(null) QOS=(null)
   JobState=COMPLETED Reason=None Dependency=(null)
   Requeue=1 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:05:01 TimeLimit=UNLIMITED TimeMin=N/A
   SubmitTime=2016-09-12T04:27:00 EligibleTime=2016-09-12T04:27:00
   StartTime=2016-09-12T04:27:00 EndTime=2016-09-12T04:32:01 Deadline=N/A
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   Partition=control AllocNode:Sid=centos6x1:2239
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=centos6x[1-2] BatchHost=centos6x1
   NumNodes=2 NumCPUs=4 NumTasks=4 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=4,node=2
   Socks/Node=* NtasksPerN:B:S:C=2:0:*:* CoreSpec=*
   MinCPUsNode=2 MinMemoryNode=0 MinTmpDiskNode=0
   Features=(null) Gres=(null) Reservation=(null)
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/apps/mpi/helloworld.sh
   WorkDir=/apps/mpi
   StdErr=/apps/mpi/myjob.out
   StdIn=/dev/null
   StdOut=/apps/mpi/myjob.out
   Power=
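If job accounting is enabled on the cluster, the finished job can also be queried from the accounting database with sacct; a minimal sketch using the job id from above (the format fields chosen here are just one possible selection):

$ sacct -j 40 --format=JobID,JobName,State,Elapsed,NNodes,NCPUS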
MPI Output Information
$ cat /apps/mpi/myjob.out
srun: cluster configuration lacks support for CPU binding
Hello, world! I am process 0 of 4 on centos6x1.
Hello, world! I am process 1 of 4 on centos6x1.
Hello, world! I am process 2 of 4 on centos6x2.
Hello, world! I am process 3 of 4 on centos6x2.
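Because helloworld.py sleeps for 300 seconds after printing, the job keeps running for about five minutes; if needed, it can be cancelled early with scancel and the job id reported by squeue:

$ scancel 40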
Job Process Information
centos6x1
$ pstree -apl 6290
slurmstepd,6290
  ├─slurm_script,6294 /tmp/slurmd/job00040/slurm_script
  │   └─mpirun,6295 python /apps/mpi/helloworld.py
  │       ├─python,6306 /apps/mpi/helloworld.py
  │       │   └─{python},6309
  │       ├─python,6307 /apps/mpi/helloworld.py
  │       │   └─{python},6308
  │       ├─srun,6297 --ntasks-per-node=1 --kill-on-bad-exit --cpu_bind=none --nodes=1 --nodelist=centos6x2 --ntasks=1 orted -mca orte_ess_jobid 3794403328
  │       │   ├─srun,6300 --ntasks-per-node=1 --kill-on-bad-exit --cpu_bind=none --nodes=1 --nodelist=centos6x2 --ntasks=1 orted -mca orte_ess_jobid 3794403328
  │       │   ├─{srun},6301
  │       │   ├─{srun},6302
  │       │   └─{srun},6303
  │       └─{mpirun},6296
  ├─{slurmstepd},6292
  └─{slurmstepd},6293
centos6x2
$ pstree -apl 4655
slurmstepd,4655
  ├─orted,4660 -mca orte_ess_jobid 3794403328 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 2 -mca orte_hnp_uri "3794403
  │   ├─python,4663 /apps/mpi/helloworld.py
  │   │   └─{python},4665
  │   └─python,4664 /apps/mpi/helloworld.py
  │       └─{python},4666
  ├─{slurmstepd},4657
  ├─{slurmstepd},4658
  └─{slurmstepd},4659
Another Way to Submit an MPI Job
$ salloc -n 8 mpiexec python /apps/mpi/helloworld.py
...
Hello, world! I am process 1 of 8 on centos6x1.
Hello, world! I am process 0 of 8 on centos6x1.
Hello, world! I am process 3 of 8 on centos6x1.
Hello, world! I am process 2 of 8 on centos6x1.
Hello, world! I am process 4 of 8 on centos6x2.
Hello, world! I am process 6 of 8 on centos6x2.
Hello, world! I am process 7 of 8 on centos6x2.
Hello, world! I am process 5 of 8 on centos6x2.
Job Process Information
centos6x1
$ pstree -apl 8212
salloc,8212 -n 8 mpiexec python /apps/mpi/helloworld.py
  ├─mpiexec,8216 python /apps/mpi/helloworld.py
  │   ├─python,8227 /apps/mpi/helloworld.py
  │   │   └─{python},8231
  │   ├─python,8228 /apps/mpi/helloworld.py
  │   │   └─{python},8232
  │   ├─python,8229 /apps/mpi/helloworld.py
  │   │   └─{python},8233
  │   ├─python,8230 /apps/mpi/helloworld.py
  │   │   └─{python},8234
  │   ├─srun,8218 --ntasks-per-node=1 --kill-on-bad-exit --cpu_bind=none --nodes=1 --nodelist=centos6x2 --ntasks=1 orted -mca orte_ess_jobid 3668246528
  │   │   ├─srun,8221 --ntasks-per-node=1 --kill-on-bad-exit --cpu_bind=none --nodes=1 --nodelist=centos6x2 --ntasks=1 orted -mca orte_ess_jobid 3668246528
  │   │   ├─{srun},8222
  │   │   ├─{srun},8223
  │   │   └─{srun},8224
  │   └─{mpiexec},8217
  └─{salloc},8213
centos6x2
$ pstree -apl 6356
slurmstepd,6356
  ├─orted,6369 -mca orte_ess_jobid 3668246528 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 2 -mca orte_hnp_uri "3668246
  │   ├─python,6372 /apps/mpi/helloworld.py
  │   │   └─{python},6376
  │   ├─python,6373 /apps/mpi/helloworld.py
  │   │   └─{python},6378
  │   ├─python,6374 /apps/mpi/helloworld.py
  │   │   └─{python},6377
  │   └─python,6375 /apps/mpi/helloworld.py
  │       └─{python},6379
  ├─{slurmstepd},6366
  ├─{slurmstepd},6367
  └─{slurmstepd},6368
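For completeness: depending on how Slurm and the MPI library were built (srun needs matching PMI/PMI2 support), the tasks can sometimes be launched directly with srun instead of going through mpirun or mpiexec. A sketch, not verified on this cluster:

$ srun -N 2 -n 8 --mpi=pmi2 python /apps/mpi/helloworld.py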
When reprinting, please cite this article with a link to the original.
This article link: http://blog.csdn.net/kongxx/article/details/52592677