Submitting MPI jobs with the bsub command on the Shen Teng 7000

Source: Internet
Author: User

Small-scale jobs run on the fat-node queues. Configuration:

The nodes in all three queues are identical: 16 sockets of 4-core Xeon X7350 2.93 GHz processors, 64 cores per node (the memory size in GB is not given).
x64_small: 2 nodes in total, 1-8 cores, 6 hours
x64_3950: 5 nodes in total, 1-64 cores, 6 hours
x64_3950_long: 11 nodes in total, 1-64 cores, 144 hours

x64_small is meant for small jobs.
For slightly larger jobs, use x64_3950 or x64_3950_long. In practice, these two queues are usually less busy than x64_small.

Large-scale jobs run in the blade queue, which is limited to jobs of more than 64 cores.
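The queue limits listed above can be turned into a small selection helper. This is only a sketch: pick_queue is a hypothetical function name, the blade queue is assumed to be the x64_blades queue used later in this post, and its run-time limit is not listed, so it is not checked.

```shell
#!/bin/bash
# Sketch: choose a queue from the limits listed above.
# pick_queue is a hypothetical helper, not an LSF command.
pick_queue() {
    local cores=$1   # number of CPU cores requested
    local hours=$2   # expected run time in whole hours

    if [ "$cores" -gt 64 ]; then
        # Assumption: jobs above 64 cores go to the blade queue;
        # its run-time limit is not listed, so it is not checked here.
        echo x64_blades
    elif [ "$cores" -le 8 ] && [ "$hours" -le 6 ]; then
        echo x64_small
    elif [ "$hours" -le 6 ]; then
        echo x64_3950
    elif [ "$hours" -le 144 ]; then
        echo x64_3950_long
    else
        echo "no listed queue allows $cores cores for $hours hours" >&2
        return 1
    fi
}

pick_queue 2 1      # prints x64_small
pick_queue 32 6     # prints x64_3950
pick_queue 32 100   # prints x64_3950_long
```

Since x64_3950 and x64_3950_long are often less busy than x64_small, a small job can also be sent there by hand; the helper only picks the smallest queue that fits.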


Example: a job with a run-time limit of 1 minute that needs two CPU cores, one core per node, submitted to the x64_small queue. Standard output goes to zlt.out, standard error goes to zlt.err, and the program to run is comm:

[scwangj@LB270108 zjl]$ bsub -W 1 -a intelmpi -n 2 -R "span[ptile=1]" -q x64_small -o zlt.out -e zlt.err mpirun.lsf ./comm
Job <78607> is submitted to queue <x64_small>.
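To make the options easier to read, the same command can be assembled from named variables, and the job ID can be pulled out of the reply that bsub prints. This is a sketch only: the two function names are hypothetical, the command is printed rather than submitted, and the application name uses the intelmpi spelling from the appendix scripts.

```shell
#!/bin/bash
# Each variable names the bsub option it feeds (values from the example above).
run_limit=1         # -W: run-time limit, in minutes
mpi_type=intelmpi   # -a: MPI integration to use (Intel MPI)
ncores=2            # -n: total number of CPU cores
per_node=1          # -R "span[ptile=N]": cores to use on each node
queue=x64_small     # -q: queue to submit to
out_file=zlt.out    # -o: file for standard output
err_file=zlt.err    # -e: file for standard error
program=./comm      # program launched through mpirun.lsf

# Hypothetical helper: print the command instead of submitting it.
print_bsub_command() {
    echo bsub -W "$run_limit" -a "$mpi_type" -n "$ncores" \
        -R "\"span[ptile=$per_node]\"" -q "$queue" \
        -o "$out_file" -e "$err_file" mpirun.lsf "$program"
}

# Hypothetical helper: read bsub's reply on stdin and print only the job ID.
job_id_from_reply() {
    sed -n 's/^Job <\([0-9][0-9]*\)> is submitted to queue <.*>\.$/\1/p'
}

print_bsub_command
echo 'Job <78607> is submitted to queue <x64_small>.' | job_id_from_reply
```

On the cluster, the printed command would be run directly and its reply piped into job_id_from_reply, for example to pass the ID to bjobs afterwards.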

To submit several jobs at once, write a bash script, submit.sh (this is not strictly necessary; stacking the commands on the command line works just as well):

#!/bin/bash
for i in 50 60 70 80 90 100
do
    bsub -W 6 -a intelmpi -n $i -R "span[ptile=1]" -q x64_blades -o $i.out -e $i.err ./matrix
done
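Before submitting a whole batch it can help to see exactly which commands the loop will issue. The variant below is a sketch under assumptions: DRY_RUN and submit_all are hypothetical names, and with DRY_RUN=1 (the default here) the bsub commands are only printed, not run.

```shell
#!/bin/bash
# DRY_RUN is a hypothetical switch: 1 prints the commands, 0 submits them.
DRY_RUN=${DRY_RUN:-1}

# Submit (or print) one job per core count given as an argument.
submit_all() {
    local run i
    if [ "$DRY_RUN" = 1 ]; then
        run=echo   # prefix every command with echo, so nothing is submitted
    else
        run=       # empty prefix: the bsub command is executed as written
    fi
    for i in "$@"; do
        $run bsub -W 6 -a intelmpi -n "$i" -R "span[ptile=1]" \
            -q x64_blades -o "$i.out" -e "$i.err" ./matrix
    done
}

submit_all 50 60 70 80 90 100
```

Running the script with DRY_RUN=0 in the environment would perform the real submissions; echo drops the quotes around the -R argument, so the printed lines are for inspection only.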

View the submitted jobs:

[scwangj@LB270210 zl]$ bjobs -u scwangj
JOBID    USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
82504    scwangj PEND  x64_blades lb270210                ./matrix   Jul  4 12:47
82505    scwangj PEND  x64_blades lb270210                ./matrix   Jul  4 12:47
82506    scwangj PEND  x64_blades lb270210                ./matrix   Jul  4 12:47
82507    scwangj PEND  x64_blades lb270210                ./matrix   Jul  4 12:47
82487    scwangj PEND  x64_small  lb270210                ./matrix   Jul  4 12:35
[scwangj@LB270210 zl]$
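With many jobs in flight, a per-queue summary is easier to read than the full listing. The sketch below counts jobs by state and queue from bjobs output on standard input; summarize_bjobs is a hypothetical helper, and it assumes the column layout shown above (job ID first, STAT third, QUEUE fourth).

```shell
#!/bin/bash
# Count jobs per state and queue from bjobs output read on stdin.
# Only lines that start with a numeric job ID are counted, which skips
# the header and any continuation lines that list extra execution hosts.
summarize_bjobs() {
    awk '$1 ~ /^[0-9]+$/ { count[$3 " " $4]++ }
         END { for (key in count) print key, count[key] }' | LC_ALL=C sort
}

# Sample input copied from the listing above; on the cluster one would run:
#   bjobs -u scwangj | summarize_bjobs
summarize_bjobs <<'EOF'
JOBID    USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
82504    scwangj PEND  x64_blades lb270210                ./matrix   Jul  4 12:47
82505    scwangj PEND  x64_blades lb270210                ./matrix   Jul  4 12:47
82506    scwangj PEND  x64_blades lb270210                ./matrix   Jul  4 12:47
82507    scwangj PEND  x64_blades lb270210                ./matrix   Jul  4 12:47
82487    scwangj PEND  x64_small  lb270210                ./matrix   Jul  4 12:35
EOF
```

For the sample input this prints one line per state and queue pair: four pending jobs in x64_blades and one in x64_small.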

Appendix:

[scwangj@v3903 20x20x100]$ cat submit.sh
#!/bin/bash
for i in 1 2 3 4 5 6 7 8
do
    bsub -W 5:40 -a intelmpi -n $i -R span[ptile=2] -q x64_small -o $i.out -e $i.err mpirun.lsf ./simple
done
[scwangj@v3903 20x20x100]$ cd ..
[scwangj@v3903 ddm]$ ls
10.err  18.err  1.err  20x20x100  2.out  9.out  bsubmpi   ddm.sh  fluid.grd   serial  solveuss.F  solvewss.F  stagsimple.F  submit.log  tdma.F    uc.fun  variable.mod
10.out  18.out  1.out  2.err      9.err  a.sh   bsub.txt  del.sh  ppoisson.F  simple  solvevss.F  s.sh        submit2.sh    submit.sh   time.dat  uc.nam
[scwangj@v3903 ddm]$ cat submit.sh
#!/bin/bash
for i in 1 2 3 4 5 6 7 8
do
    bsub -W 5:40 -a intelmpi -n $i -R span[ptile=2] -q x64_small -o $i.out -e $i.err mpirun.lsf ./simple
done
[scwangj@v3903 ddm]$ cat submit2.sh
#!/bin/bash
for i in  9 10 11 12 13 14 15 16 17 18
do
    bsub -W 5:40 -a intelmpi -n $i -R span[ptile=9] -q x64_3950 -o $i.out -e $i.err mpirun.lsf ./simple
done
[scwangj@v3903 ddm]$ bjobs -u scwangj
JOBID    USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
82725    scwangj RUN   x64_3950   v3903       9*t3701     * ./simple Jul  4 20:39
                                              2*t3601
82726    scwangj RUN   x64_3950   v3903       9*t3802     * ./simple Jul  4 20:39
                                              3*t4102
82727    scwangj RUN   x64_3950   v3903       9*t3701     * ./simple Jul  4 20:39
                                              4*t3802
82728    scwangj RUN   x64_3950   v3903       9*t3601     * ./simple Jul  4 20:39
                                              5*t4102
82729    scwangj RUN   x64_3950   v3903       9*t3701     * ./simple Jul  4 20:39
                                              6*t3802
82730    scwangj RUN   x64_3950   v3903       9*t3601     * ./simple Jul  4 20:39
                                              7*t4102
82731    scwangj RUN   x64_3950   v3903       9*t3701     * ./simple Jul  4 20:39
                                              8*t3802
82745    scwangj RUN   x64_3950   v3903       9*t3701     * ./simple Jul  4 20:41
82634    scwangj RUN   x64_small  v3903       t4601       * ./simple Jul  4 16:59
82635    scwangj RUN   x64_small  v3903       1*t4601     * ./simple Jul  4 16:59
                                              1*t3701
82710    scwangj RUN   x64_small  v3903       t4601       * ./simple Jul  4 20:37
82711    scwangj RUN   x64_small  v3903       2*t3701     * ./simple Jul  4 20:37
82746    scwangj PEND  x64_3950   v3903                   * ./simple Jul  4 20:41
82619    scwangj PEND  x64_small  lb270210                * ./matrix Jul  4 16:53
[scwangj@v3903 ddm]$
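The EXEC_HOST column in the listing above shows how -n and span[ptile=...] interact: job 82725 runs 9 processes on t3701 and 2 on t3601, that is, 11 cores with ptile=9. The sketch below reproduces that arithmetic. It assumes each host is filled up to the ptile value before the next host is used, which matches this listing but is not taken from the scheduler documentation; ptile_layout is a hypothetical helper.

```shell
#!/bin/bash
# Print how many processes land on each host for a given -n and ptile.
# Assumption: every host is filled up to ptile before the next one is used.
ptile_layout() {
    local n=$1       # value passed to -n (total cores)
    local ptile=$2   # value in span[ptile=...] (cores per host)
    local out="" take

    if [ "$ptile" -le 0 ]; then
        echo "ptile must be positive" >&2
        return 1
    fi
    while [ "$n" -gt 0 ]; do
        if [ "$n" -lt "$ptile" ]; then take=$n; else take=$ptile; fi
        out="${out:+$out }$take"   # append, separated by single spaces
        n=$(( n - take ))
    done
    echo "$out"
}

ptile_layout 11 9   # prints "9 2", as for job 82725
ptile_layout 2 1    # prints "1 1", as in the first bsub example
```

The number of values printed is the number of hosts the job needs, which is useful when checking a request against the node counts of each queue.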
